Abstract

This paper investigates the minimum energy required to transmit $k$ information bits with a given
reliability over a multiple-antenna Rayleigh block-fading channel, with and without channel state
information (CSI) at the receiver. No feedback is assumed. It is well known that the ratio between the
minimum energy per bit and the noise level converges to −1.59 dB as $k$ goes to infinity, regardless
of whether CSI is available at the receiver or not. This paper shows that lack of CSI at the receiver
causes a slowdown in the speed of convergence to −1.59 dB as $k \to \infty$ compared to the case of perfect
receiver CSI. Specifically, we show that, in the no-CSI case, the gap to −1.59 dB is proportional to
$((\log k)/k)^{1/3}$, whereas when perfect CSI is available at the receiver, this gap is proportional to $1/\sqrt{k}$.

In both cases, the gap to −1.59 dB is independent of the number of transmit antennas and of the
channel's coherence time. Numerically, we observe that, when the receiver is equipped with a single
antenna, to achieve an energy per bit of −1.5 dB in the no-CSI case, one needs to transmit at least
$7 \times 10^7$ information bits, whereas $6 \times 10^4$ bits suffice for the case of perfect CSI at the receiver.
I. Introduction

A classic result in information theory is that, for a wide range of channels including AWGN
channels and fading channels, the minimum energy per bit required for reliable communication
satisfies [1], [2]

$$\left.\frac{E_b}{N_0}\right|_{\min} = \log_e 2 = -1.59\ \text{dB}. \tag{1}$$

Here, $N_0$ is the noise power per complex degree of freedom. For fading channels, (1) holds
regardless of whether the instantaneous fading realizations are known to the receiver or not [2,
Th. 1], [3].^1

The expression in (1) is asymptotic in several aspects:

• the blocklength $n$ of each codeword is infinite;

• the number of information bits $k$, or equivalently, the number of messages $M = 2^k$, is
infinite;

• the error probability $\epsilon$ vanishes;

• the total energy $E$ is infinite;

• $E/n$ vanishes.

For many channels, the limit in (1) does not change if we allow the error probability to be
positive. However, keeping any of the other parameters fixed results in a backoff from (1) [2],
[4]-[8].

In this paper, we study the maximum number of information bits $k$ that can be transmitted
with a finite energy $E$ and a fixed error probability $\epsilon > 0$ over a multiple-input multiple-output
(MIMO) Rayleigh block-fading channel, when there is no constraint on the blocklength $n$.
Equivalently, we determine the minimum energy $E$ required to transmit $k$ information bits with
error probability $\epsilon$. We consider two scenarios:

1) neither the transmitter nor the receiver has a priori channel state information (CSI);

2) perfect CSI is available at the receiver (CSIR) and no CSI is available at the transmitter.

Throughout the paper, we shall refer to these two scenarios as the no-CSI case and the perfect-CSIR
case, respectively.

^1 Knowledge of the fading realizations at the transmitter may improve (1), because it enables the transmitter to signal on the
channel maximum-eigenvalue eigenspace [2].
Related work: For nonfading AWGN channels with unlimited blocklength, Polyanskiy, Poor,
and Verdú [8] showed that the maximum number of codewords $M^*(E,\epsilon)$ that can be transmitted
with energy $E$ and error probability $\epsilon$ satisfies^2

$$\log M^*(E,\epsilon) = \frac{E}{N_0}\log e - \sqrt{\frac{2E}{N_0}}\,Q^{-1}(\epsilon)\log e + \frac{1}{2}\log\frac{E}{N_0} + O(1), \quad E \to \infty. \tag{2}$$

Here, $Q^{-1}(\cdot)$ denotes the inverse of the Gaussian Q-function. The first term on the right-hand
side (RHS) of (2) gives the −1.59 dB limit. The second term captures the penalty due to the
stochastic variations of the channel. This term plays the same role as the channel dispersion in
finite-blocklength analyses [7], [9]. In terms of the minimum energy per bit $E_b^*(k,\epsilon)$ necessary
to transmit $k$ bits with error probability $\epsilon$, (2) implies that, for large $E$,

$$\frac{E_b^*(k,\epsilon)}{N_0} = \log_e 2 + \sqrt{\frac{2\log_e 2}{k}}\,Q^{-1}(\epsilon) + o\!\left(\frac{1}{\sqrt{k}}\right) \tag{3}$$

i.e., that the gap to −1.59 dB is proportional to $1/\sqrt{k}$. The asymptotic expansion (2) is established
in [8] by showing that in the limit $E \to \infty$ a nonasymptotic achievability bound and a
nonasymptotic converse bound match up to third order. The achievability bound is obtained by
computing the error probability under maximum-likelihood decoding of a codebook consisting of
$M$ orthogonal codewords (e.g., uncoded $M$-ary pulse-position modulation (PPM)). The converse
bound follows from the meta-converse theorem [7, Th. 27] with auxiliary distribution chosen
equal to the noise distribution. Kostina, Polyanskiy, and Verdú [10] generalized (2) to the
setting of joint source and channel coding, and characterized the minimum energy required
to reproduce $k$ source samples with a given fidelity after transmission over an AWGN channel.
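
As a numerical illustration of the AWGN expansion discussed above, the following Python sketch evaluates the two leading terms of the energy-per-bit approximation, $\log_e 2 + \sqrt{2\log_e 2/k}\,Q^{-1}(\epsilon)$, and converts the result to dB. It is only a sketch under stated assumptions: the remainder term is dropped, $N_0$ is normalized to one, natural logarithms are used, and the function name and the choice $\epsilon = 10^{-3}$ in the demo loop are ours, not taken from [8].

```python
import math
from statistics import NormalDist


def awgn_energy_per_bit_db(k, eps):
    """Two-term approximation of the minimum E_b/N_0 (in dB) to send k bits.

    Assumption: natural logarithms, N_0 = 1, and the o(1/sqrt(k))
    remainder of the expansion is ignored.
    """
    # Q^{-1}(eps) is the (1 - eps)-quantile of the standard Gaussian.
    q_inv = NormalDist().inv_cdf(1.0 - eps)
    ratio = math.log(2) + math.sqrt(2 * math.log(2) / k) * q_inv
    return 10 * math.log10(ratio)


if __name__ == "__main__":
    for k in (10**3, 10**4, 6 * 10**4, 10**6):
        print(k, round(awgn_energy_per_bit_db(k, 1e-3), 3))
```

Under these assumptions the approximation is close to −1.5 dB for $k = 6 \times 10^4$ and $\epsilon = 10^{-3}$, and it decreases toward −1.59 dB as $k$ grows.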

Moving to fading channels, for the case of no CSI, flash signalling [2, Def. 2] (i.e., peaky signals)
must be used to reach the −1.59 dB limit [2]. In the presence of a peak-power constraint, (1)
cannot be achieved [11]-[14]. Verdú [2] studied the rate of convergence of the minimum energy
per bit to −1.59 dB as the spectral efficiency vanishes. He showed that, differently from the
perfect-CSIR case, in the no-CSI case the −1.59 dB limit is approached with zero wideband
slope. Namely, the slope of the spectral-efficiency versus energy-per-bit function at −1.59 dB
is zero. This implies that operating close to the −1.59 dB limit is very expensive in terms of
bandwidth in the no-CSI case. For the scenario of finite blocklength $n$, fixed energy budget $E$,
and fixed probability of error $\epsilon$, bounds and approximations on the maximum channel coding
rate over fading channels (under various CSI assumptions) are reported in [15]-[21].

^2 Unless otherwise indicated, the log and the exp functions are taken with respect to an arbitrary fixed base.

Contributions: Focusing on the regime of unlimited blocklength, but finite energy $E$, and
finite error probability $\epsilon$, we provide upper and lower bounds on the maximum number of
codewords $M^*(E,\epsilon)$ that can be transmitted over an $m_t \times m_r$ MIMO Rayleigh block-fading
channel with channel's coherence interval of $n_c$ symbols. For the no-CSI case, we show that for
every $\epsilon \in (0,1/2)$

$$\log M^*(E,\epsilon) = \frac{m_r E}{N_0}\log e - V_0 \cdot \left(\frac{m_r E}{N_0}\,Q^{-1}(\epsilon)\right)^{2/3}\left(\log\frac{m_r E}{N_0}\right)^{1/3} + O\!\left(\frac{E^{2/3}\log\log E}{(\log E)^{2/3}}\right), \quad E \to \infty \tag{4}$$

where

$$V_0 = \left(12^{-1/3} + \left(\frac{2}{3}\right)^{1/3}\right)(\log e)^{2/3}. \tag{5}$$


Note that the asymptotic expansion (4) does not depend on the number of transmit antennas $m_t$
and the channel's coherence interval $n_c$. The fact that the first term does not depend on $n_c$ and
$m_t$ follows directly from [2, Eq. (52)] by noting that an $m_t \times m_r$ block-fading MIMO channel
with coherence interval $n_c$ is equivalent to an $m_t n_c \times m_r n_c$ memoryless MIMO fading channel
with block-diagonal channel matrix [2, p. 1339]. Our result (4) shows that the same holds for
the second term in the expansion of $\log M^*(E,\epsilon)$ for $E \to \infty$. In terms of minimum (received)
energy per bit $E_b^*(k,\epsilon)$, (4) implies that, for large $E$,^3


E*dk,e) 

Nn 


loge 2 + Eo ■ 


logefe 

k 


1/3 




( 6 ) 


i.e., the gap to −1.59 dB is proportional to $((\log_e k)/k)^{1/3}$.

We establish (4) by analyzing in the limit $E \to \infty$ an achievability bound and a converse
bound. The achievability bound follows from a nonasymptotic extension of Verdú's
capacity-per-unit-cost achievability scheme [5, pp. 1023-1024]. This scheme relies on a codebook consisting
of the concatenation of uncoded PPM and a repetition code, and on a decoder that performs
binary hypothesis testing. The converse bound relies on the meta-converse theorem [7, Th. 31]
with auxiliary distribution chosen as in the AWGN case. The resulting bound involves an
optimization over the infinite-dimensional space of input codewords (recall that in our setup there
is no constraint on the blocklength $n$). By exploiting the Gaussianity of the fading process, we
show that this infinite-dimensional optimization problem can be reduced to a three-dimensional
one. The tools needed to establish this result are the ones developed by Abbe, Huang, and
Telatar [22] to prove Telatar's minimum outage probability conjecture for multiple-input
single-output (MISO) Rayleigh-fading channels. Indeed, both problems involve the optimization of
quantiles of a weighted convolution of exponential distributions.

^3 By considering the received energy per bit instead of the transmit energy per bit, we account for the array gain resulting
from the use of multiple receive antennas.

The asymptotic analysis of achievability and converse bounds reveals the following tension: 
on the one hand, one would like to make the codewords peaky to overcome lack of channel 
knowledge; on the other hand, one would like to spread the energy of the codewords uniformly 
over multiple coherence intervals to mitigate the stochastic variations in the received signal 
energy due to the fading. 

For the case of perfect CSIR, we prove that for every $\epsilon \in (0,1/2)$

$$\log M^*(E,\epsilon) = \frac{m_r E}{N_0}\log e - \sqrt{\frac{2 m_r E}{N_0}}\,Q^{-1}(\epsilon)\log e + \frac{1}{2}\log\frac{m_r E}{N_0} + O\!\left(\sqrt{\log E}\right), \quad E \to \infty. \tag{7}$$

Note that the asymptotic expansion (7) is also independent of the number of transmit antennas
$m_t$ and the channel's coherence interval $n_c$. Furthermore, apart from an energy normalization
resulting from the array gain, this asymptotic expansion coincides with the one given in (2) for
the AWGN case up to a $O(\sqrt{\log E})$ term. In terms of minimum (received) energy per bit, (7)
implies that (3) holds also for the perfect-CSIR case.

To establish (7), we show that every code for the AWGN channel can be transformed into a
code for the MIMO block-memoryless Rayleigh-fading channel having the same probability of
error. This is achieved by concatenating the AWGN code with a rate $1/N$ repetition code, by
performing maximum ratio combining at the receiver, and then by letting $N \to \infty$. We obtain a
converse bound that matches the achievability bound up to third order as $E \to \infty$ by using again
the meta-converse theorem and then by optimizing over all input codewords. The asymptotic
analysis of the converse bound reveals that spreading the energy of the codewords uniformly
across many coherence intervals is necessary to mitigate the stochastic variations in the energy
of the received signal due to fading.

In both the no-CSI and the perfect-CSIR case, the asymptotic analysis of the achievability
bound is based on a standard application of the Berry-Esseen central-limit theorem (see, e.g., [23,
Ch. XVI.5]). The asymptotic analysis of the converse part in both cases is not as straightforward.
The main difficulty is that, unlike for discrete memoryless channels and AWGN channels, we
cannot directly invoke the central-limit theorem to evaluate the information density, because
the central-limit theorem may not hold if the energy of a codeword is concentrated on few of
its symbols. To solve this problem, we develop new tools that rely explicitly on the Gaussianity
of the fading process. Specifically, for the no-CSI case, we exploit the log-concavity of the
information density to lower-bound its cumulative distribution function (cdf). The resulting bound
allows us to eliminate the codewords for which the central-limit theorem does not apply. For
the perfect-CSIR case, we show that the distribution of the information density is unimodal and
right-skewed (i.e., its mean is greater than its mode). Using this result, we then prove that to
optimize the cdf of the information density, it is necessary to reduce its "skewness", thereby
showing that the optimized information density must converge as $E \to \infty$ to a (non-skewed)
Gaussian distribution.

By comparing (7) with (4), we see that, although the minimum (received) energy per bit
approaches (1) as $k$ increases regardless of whether CSIR is available or not, the convergence
is slower for the no-CSI case. For the case $m_r = 1$, our nonasymptotic bounds reveal that to
achieve an energy per bit of −1.5 dB, one needs to transmit at least $7 \times 10^7$ information bits in
the no-CSI case, whereas $6 \times 10^4$ bits suffice in the perfect-CSIR case. Furthermore, the bounds
also reveal that it takes 2 dB more of energy to transmit 1000 information bits in the no-CSI
case compared to the perfect-CSIR case. As a possible application, our results may be relevant
for the design of wireless sensor networks, where energy constraints are often more stringent
than bandwidth constraints, and where data packets are usually short.
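
To make the two convergence speeds concrete, the following Python sketch compares the leading-order gaps to $\log_e 2$ obtained by inverting the two asymptotic expansions in $k$. It is a rough illustration under stated assumptions: natural logarithms, $N_0 = 1$, all remainder terms dropped, and the no-CSI coefficient taken as $12^{-1/3} + (2/3)^{1/3}$. The function names are ours. The output is not a substitute for the nonasymptotic bounds, and it can be noticeably off for small $k$.

```python
import math
from statistics import NormalDist


def q_inv(eps):
    """Inverse of the Gaussian Q-function."""
    return NormalDist().inv_cdf(1.0 - eps)


def gap_perfect_csir(k, eps):
    """Leading-order gap to log_e(2) with perfect CSIR: order 1/sqrt(k)."""
    return math.sqrt(2 * math.log(2) / k) * q_inv(eps)


def gap_no_csi(k, eps):
    """Leading-order gap to log_e(2) without CSI: order ((log k)/k)^(1/3)."""
    c = 12 ** (-1 / 3) + (2 / 3) ** (1 / 3)  # assumed constant, natural logs
    return c * (math.log(2) * q_inv(eps)) ** (2 / 3) * (math.log(k) / k) ** (1 / 3)


def to_db(gap):
    """Energy per bit in dB for a given gap to log_e(2)."""
    return 10 * math.log10(math.log(2) + gap)


if __name__ == "__main__":
    for k in (10**4, 6 * 10**4, 10**6, 7 * 10**7):
        print(k, round(to_db(gap_perfect_csir(k, 1e-3)), 3),
              round(to_db(gap_no_csi(k, 1e-3)), 3))
```

With $\epsilon = 10^{-3}$ (our choice), these leading-order expressions land near −1.5 dB at $k = 6 \times 10^4$ with perfect CSIR and at $k = 7 \times 10^7$ without CSI.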

Notation: Upper case letters such as $X$ denote scalar random variables and their realizations
are written in lower case, e.g., $x$. We use boldface upper case letters to denote random vectors,
e.g., $\mathbf{X}$, and boldface lower case letters for their realizations, e.g., $\mathbf{x}$. Upper case letters of two
special fonts are used to denote deterministic matrices (e.g., $\mathsf{Y}$) and random matrices (e.g., $\mathbb{Y}$). The
symbol $\mathbb{N}$ denotes the set of natural numbers, and $\mathbb{R}_+$ denotes the set of nonnegative real numbers.
The superscripts $^T$ and $^H$ stand for transposition and Hermitian transposition, respectively, and $^*$
stands for the complex conjugate. We use $\mathrm{tr}(\mathsf{A})$ and $\det(\mathsf{A})$ to denote the trace and determinant
of the matrix $\mathsf{A}$, respectively, and use $\|\mathsf{A}\|_F = \sqrt{\mathrm{tr}(\mathsf{A}\mathsf{A}^H)}$ to designate the Frobenius norm of $\mathsf{A}$.
For an infinite-dimensional complex vector $\mathbf{x} \in \mathbb{C}^\infty$, we use $\|\mathbf{x}\|_p$ to denote the $\ell_p$-norm of $\mathbf{x}$,
i.e., $\|\mathbf{x}\|_p = \left(\sum_{i=1}^{\infty} |x_i|^p\right)^{1/p}$. The $\ell_\infty$-norm of $\mathbf{x}$ is defined as $\|\mathbf{x}\|_\infty = \sup_i |x_i|$. We use $\mathbf{e}_j$ to
denote the infinite-dimensional vector that has 1 in the $j$th entry and 0 elsewhere, and use $\mathsf{I}_a$
to denote the identity matrix of size $a \times a$. The distribution of a circularly symmetric Gaussian
random vector with covariance matrix $\mathsf{A}$ is denoted by $\mathcal{CN}(\mathbf{0}, \mathsf{A})$. We use $\mathrm{Exp}(\mu)$ to denote the
exponential distribution with mean $\mu$, and use $\mathrm{Gamma}(a, b)$ to denote the Gamma distribution
with shape parameter $a$ and scale parameter $b$ [24, Ch. 17]. For two functions $f$ and $g$, we
use $f \star g$ to denote the convolution of $f$ and $g$. Furthermore, the notation $f(x) = O(g(x))$,
$x \to \infty$, means that $\limsup_{x\to\infty} |f(x)/g(x)| < \infty$, and $f(x) = o(g(x))$, $x \to \infty$, means
that $\lim_{x\to\infty} |f(x)/g(x)| = 0$. For two measures $\mu$ and $\nu$, we write $\mu \ll \nu$ if $\mu$ is absolutely
continuous [25, p. 88] with respect to $\nu$. Finally, $|\cdot|^+ = \max\{0, \cdot\}$.

Next, we introduce two definitions related to the performance of optimal hypothesis testing.
Given two probability distributions $P$ and $Q$ on a common measurable space $\mathcal{W}$, we define a
randomized test between $P$ and $Q$ as a random transformation $P_{Z|W} : \mathcal{W} \to \{0,1\}$ where 0
indicates that the test chooses $Q$. We shall need the following performance metric for the test
between $P$ and $Q$:

$$\beta_\alpha(P,Q) \triangleq \min_{P_{Z|W}:\ \int P_{Z|W}(1\,|\,w)P(dw) \ge \alpha} \int P_{Z|W}(1\,|\,w)\,Q(dw) \tag{8}$$

where the minimum is over all probability distributions $P_{Z|W}$ satisfying

$$\int P_{Z|W}(1\,|\,w)\,P(dw) \ge \alpha. \tag{9}$$

The minimum in (8) is guaranteed to be achieved by the Neyman-Pearson lemma [26]. For an
arbitrary set $\mathcal{F}$, we define the following performance metric for the composite hypothesis testing
between $Q_Y$ and the collection $\{P_{Y|X=x}\}_{x \in \mathcal{F}}$:

$$\kappa_\tau(\mathcal{F}, Q_Y) \triangleq \inf \int P_{Z|Y}(1\,|\,y)\,Q_Y(dy). \tag{10}$$

Here, the infimum is over all conditional probability distributions $P_{Z|Y} : \mathcal{Y} \to \{0,1\}$ satisfying

$$\inf_{x \in \mathcal{F}} \int P_{Z|Y}(1\,|\,y)\,P_{Y|X=x}(dy) \ge \tau. \tag{11}$$
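
For intuition about the metric $\beta_\alpha(P,Q)$, the following Python sketch computes it for two distributions on a finite alphabet by ordering outcomes according to their likelihood ratio and randomizing on the boundary outcome, in the spirit of the Neyman-Pearson lemma. The finite-alphabet restriction and the function name are assumptions made here for illustration only; the channels studied in this paper have continuous outputs.

```python
def beta_alpha(p, q, alpha):
    """Smallest Q-probability of deciding 'P' among all randomized tests
    that decide 'P' with probability at least alpha under P.

    p, q: lists of probabilities over the same finite alphabet.
    """
    # Outcomes with p[i] == 0 never help to meet the constraint, so skip them.
    idx = [i for i in range(len(p)) if p[i] > 0]
    # Sort by decreasing likelihood ratio p/q (q == 0 gives ratio +infinity).
    idx.sort(key=lambda i: p[i] / q[i] if q[i] > 0 else float("inf"),
             reverse=True)
    p_mass, q_mass = 0.0, 0.0
    for i in idx:
        if p_mass + p[i] < alpha:
            # Accept this outcome with probability one.
            p_mass += p[i]
            q_mass += q[i]
        else:
            # Randomize on the boundary outcome to hit alpha exactly.
            frac = (alpha - p_mass) / p[i]
            return q_mass + frac * q[i]
    return q_mass


if __name__ == "__main__":
    print(beta_alpha([0.5, 0.5], [0.9, 0.1], 0.75))
```

When $P = Q$, no test can do better than guessing, and the function returns $\alpha$.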


II. Problem Formulation 


A. Channel Model and Codes 

We consider a MIMO Rayleigh block-fading channel with $m_t$ transmit antennas and $m_r$ receive
antennas that stays constant over a block of $n_c$ channel uses (coherence interval) and changes
independently from block to block. The channel input-output relation within the $i$th coherence
interval is given by

$$\mathbb{V}_i = \mathbb{U}_i\mathbb{H}_i + \mathbb{Z}_i. \tag{12}$$

Here, $\mathbb{U}_i \in \mathbb{C}^{n_c \times m_t}$ and $\mathbb{V}_i \in \mathbb{C}^{n_c \times m_r}$ are the transmitted and received signals, respectively,
expressed in matrix form; $\mathbb{H}_i \in \mathbb{C}^{m_t \times m_r}$ is the channel matrix, which is assumed to have i.i.d.
$\mathcal{CN}(0,1)$ entries; $\mathbb{Z}_i \in \mathbb{C}^{n_c \times m_r}$ is the additive noise matrix, also with i.i.d. $\mathcal{CN}(0, N_0)$ entries. We
assume that $\{\mathbb{H}_i\}$ and $\{\mathbb{Z}_i\}$ are mutually independent, and take on independent realizations over
successive coherence intervals (block-memoryless assumption). In the remainder of the paper,
we shall set $N_0 = 1$, for notational convenience.

We are interested in the scenario where the blocklength is unlimited, and we aim at characterizing
the minimum energy required to transmit $k$ information bits over the channel (12)
with a given reliability. We shall use $\mathbb{U}^\infty$ and $\mathbb{V}^\infty$ to denote the infinite sequences $\{\mathbb{U}_i\}$ and
$\{\mathbb{V}_i\}$, respectively. At times, we shall interpret $\mathbb{U}^\infty$ as the infinite-dimensional matrix obtained
by stacking the matrices $\{\mathbb{U}_i\}$, $i \in \mathbb{N}$, on top of each other. In this case, the matrix $\mathbb{U}^\infty$ has $m_t$
columns and infinitely many rows, and its $t$th column vector represents the signal sent from the
$t$th transmit antenna. The energy of the input matrix $\mathbb{U}^\infty$ is measured as follows:

$$\|\mathbb{U}^\infty\|_F^2 = \sum_{i=1}^{\infty} \|\mathbb{U}_i\|_F^2. \tag{13}$$

Furthermore, we denote the set of all input matrices $\mathbb{U}^\infty$ by $\mathcal{A}$ and the set of all output matrices
$\mathbb{V}^\infty$ by $\mathcal{B}$. Finally, we let $\mathcal{H}$ be the set of channel matrices $\mathbb{H}^\infty$.

Next, we define channel codes for the channel (12) for both the no-CSI and the perfect-CSIR
case.
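Definition 1: An $(E, M, \epsilon)$-code for the channel (12) for the no-CSI case consists of a set of
codewords $\{\mathsf{C}_1, \ldots, \mathsf{C}_M\} \in \mathcal{A}^M$ satisfying the energy constraint

$$\|\mathsf{C}_j\|_F^2 \le E, \quad j \in \{1, \ldots, M\} \tag{14}$$

and a decoder $g : \mathcal{B} \to \{1, \ldots, M\}$ satisfying the maximum error probability constraint

$$\max_{j \in \{1,\ldots,M\}} P\big[g(\mathbb{V}^\infty) \ne j \,\big|\, \mathbb{U}^\infty = \mathsf{C}_j\big] \le \epsilon. \tag{15}$$

Here, $\mathbb{V}^\infty$ is the output induced by the codeword $\mathbb{U}^\infty = \mathsf{C}_j$ according to (12). The maximum
number of messages that can be transmitted with energy $E$ and maximum error probability $\epsilon$ is

$$M^*(E,\epsilon) \triangleq \max\{M : \exists\, (E, M, \epsilon)\text{-code}\}. \tag{16}$$

Similarly, the minimum energy per bit is defined as

$$E_b^*(k, \epsilon) \triangleq \frac{1}{k}\inf\{E : \exists\, (E, 2^k, \epsilon)\text{-code}\}. \tag{17}$$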

K 

Definition 2: An $(E, M, \epsilon)$-code for the channel (12) for the perfect-CSIR case consists of
a set of codewords $\{\mathsf{C}_1, \ldots, \mathsf{C}_M\} \in \mathcal{A}^M$ satisfying the energy constraint (14), and a decoder
$g : \mathcal{B} \times \mathcal{H} \to \{1, \ldots, M\}$ satisfying the maximum error probability constraint

$$\max_{j \in \{1,\ldots,M\}} P\big[g(\mathbb{V}^\infty, \mathbb{H}^\infty) \ne j \,\big|\, \mathbb{U}^\infty = \mathsf{C}_j\big] \le \epsilon. \tag{18}$$

The maximum number of messages that can be transmitted with energy $E$ and maximum error
probability $\epsilon$ for the perfect-CSIR case is defined as in (16).

As we shall show in the next section, one can derive tight bounds on $M^*(E,\epsilon)$ (for both
the no-CSI and the perfect-CSIR case) by focusing exclusively on the memoryless single-input
multiple-output (SIMO) scenario $n_c = m_t = 1$. Therefore, we shall next develop a specific
notation to address this setup. In the SIMO case, the input-output relation reduces to

$$V_{r,i} = H_{r,i}u_i + Z_{r,i}, \quad r \in \{1, \ldots, m_r\},\ i \in \mathbb{N}. \tag{19}$$

Here, $V_{r,i} \in \mathbb{C}$ denotes the received symbol at the $r$th receive antenna on the $i$th channel use,
and $H_{r,i}$ and $Z_{r,i}$ denote the fading coefficient and the additive noise, respectively. We shall set
$\mathbf{u} = [u_1, u_2, \ldots]$ and $\mathbf{V}_r = [V_{r,1}, V_{r,2}, \ldots]$.

B. An Equivalent Channel Model for the no-CSI case 

Focusing on the no-CSI case, we define next a channel model that is equivalent to (19).
Observe that, given $\mathbf{U} = \mathbf{u}$, the output vectors $\mathbf{V}_1, \ldots, \mathbf{V}_{m_r}$ are i.i.d. Gaussian, i.e.,

$$P_{\mathbf{V}_r|\mathbf{U}=\mathbf{u}} = \prod_{i=1}^{\infty} \mathcal{CN}\big(0, 1 + |u_i|^2\big), \quad r \in \{1, \ldots, m_r\}. \tag{20}$$

Since the $\{\mathbf{V}_r\}$ depend on the input symbols $\{u_i\}$ only through their squared magnitude $\{|u_i|^2\}$,
we can reduce without loss of generality the input space to $\mathbb{R}_+^\infty$. We also note that, given
$\mathbf{U} = \mathbf{u}$, the joint conditional probability distribution of the random variables $\{V_{r,i}\}$ in (19) does
not change if we multiply $\{V_{r,i}\}$ with arbitrary deterministic phases. This means that the $\{|V_{r,i}|^2\}$
are a sufficient statistic for the detection of $\mathbf{u}$ from $\{V_{r,i}\}$. Letting $x_i = |u_i|^2$ and $Y_{r,i} = |V_{r,i}|^2$,
$r \in \{1, \ldots, m_r\}$, $i \in \mathbb{N}$, we obtain the following input-output relation, which is equivalent
to (19):

$$Y_{r,i} = (1 + x_i)S_{r,i}, \quad r \in \{1, \ldots, m_r\},\ i \in \mathbb{N}. \tag{21}$$

Here, the input $x_i$ and the output $Y_{r,i}$ are nonnegative real numbers, and $\{S_{r,i}\}$ are i.i.d. $\mathrm{Exp}(1)$-distributed.
We shall denote the input of the channel (21) by $\mathbf{x} = [x_1, x_2, \ldots] \in \mathbb{R}_+^\infty$ and denote
the output by the matrix $\mathbb{Y}$, whose entry on the $r$th row and the $i$th column is $Y_{r,i}$. Since
$\|\mathbf{x}\|_1 = \|\mathbf{u}\|_2^2$ and since $\|\mathbf{x}\|_\infty = \|\mathbf{u}\|_\infty^2$, we shall measure the energy and the peakiness of
an input codeword $\mathbf{x}$ for the channel (21) by its $\ell_1$-norm $\|\mathbf{x}\|_1$, and by its $\ell_\infty$-norm $\|\mathbf{x}\|_\infty$,
respectively.
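
The equivalence between the complex SIMO model and the real-valued model with exponential outputs can be checked empirically for a single channel use and a single receive antenna. The following Python sketch is a Monte Carlo illustration only: it draws $|Hu + Z|^2$ with $H, Z \sim \mathcal{CN}(0,1)$ and compares it with $(1 + |u|^2)S$, $S \sim \mathrm{Exp}(1)$. The function names, the sample size, and the test input $u = 1 + 2j$ are our own choices.

```python
import math
import random


def sample_squared_output(u, rng):
    """One draw of |H*u + Z|^2 with independent H, Z ~ CN(0, 1)."""
    s = math.sqrt(0.5)  # real and imaginary parts each have variance 1/2
    h = complex(rng.gauss(0, s), rng.gauss(0, s))
    z = complex(rng.gauss(0, s), rng.gauss(0, s))
    return abs(h * u + z) ** 2


def sample_equivalent_output(x, rng):
    """One draw of (1 + x) * S with S ~ Exp(1)."""
    return (1 + x) * rng.expovariate(1.0)


def empirical_mean(sampler, arg, n, rng):
    """Average of n draws of sampler(arg, rng)."""
    return sum(sampler(arg, rng) for _ in range(n)) / n


if __name__ == "__main__":
    rng = random.Random(0)
    u = 1 + 2j
    x = abs(u) ** 2
    print(empirical_mean(sample_squared_output, u, 50000, rng))
    print(empirical_mean(sample_equivalent_output, x, 50000, rng))
```

Both empirical means should be close to $1 + |u|^2$, which is the mean of an exponential random variable with that scale.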


III. Minimum Energy Per Bit 

We shall now characterize $M^*(E,\epsilon)$ for both the no-CSI and the perfect-CSIR case. The
organization of this section is as follows. In Section III-A, we first present nonasymptotic
achievability and converse bounds on $M^*(E,\epsilon)$ for general channels subject to a cost constraint.
In Section III-B, we then particularize these bounds to the channel (12) for the no-CSI case.
Both the converse and achievability bounds in Section III-B are derived by reducing the MIMO
channel (12) to the SIMO channel (21). We then show in Section III-C that these bounds match
asymptotically as $E \to \infty$ up to second order, thus establishing (4). In Section III-D, we
derive bounds on $M^*(E,\epsilon)$ for the perfect-CSIR case and prove the asymptotic expansion (7).
Finally, the nonasymptotic bounds for both the no-CSI and the perfect-CSIR case are evaluated
numerically in Section III-E.

A. General Nonasymptotic Bounds

We consider in this section general stationary memoryless channels $(\mathcal{X}, P_{Y|X}, \mathcal{Y})$ with input
codewords subject to a cost constraint. As in [5], we use $b[x]$ to denote the cost of the symbol $x$
in the input alphabet $\mathcal{X}$. We shall also assume that there exists a zero-cost symbol, which we
label as "0". With a slight abuse of notation, we use $E$ to denote the cost constraint imposed
on a codeword. An $(E, M, \epsilon)$-code for this general channel consists of a set of $M$ codewords
$\mathbf{c}_j = [c_{j,1}, c_{j,2}, \ldots]$, $j = 1, \ldots, M$, that satisfy the cost constraint

$$\sum_{i=1}^{\infty} b[c_{j,i}] \le E \tag{22}$$

and has maximum error probability not exceeding $\epsilon$. We next present two achievability bounds
on $M^*(E,\epsilon)$ that are finite-energy generalizations of Verdú's lower bound [5, pp. 1023-1024]
on the capacity per unit cost.^4

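Theorem 1: Consider a stationary memoryless channel $(\mathcal{X}, P_{Y|X}, \mathcal{Y})$ that has a zero-cost input
symbol. For every $N \in \mathbb{N}$, every $0 < \epsilon < 1$, and every input symbol $x_0 \in \mathcal{X}$ satisfying $b[x_0] > 0$,
there exists an $(E, M, \epsilon)$-code for which $E = b[x_0]N$ and

$$M - 1 \ge \sup_{0 < \tau < \epsilon} \frac{\tau}{\beta_{1-\epsilon+\tau}\big(P^{\otimes N}_{Y|X=x_0}, P^{\otimes N}_{Y|X=0}\big)}. \tag{23}$$

Here, $\beta_{(\cdot)}(\cdot, \cdot)$ is given in (8), and

$$P^{\otimes N}_{Y|X=x} \triangleq \underbrace{P_{Y|X=x} \times \cdots \times P_{Y|X=x}}_{N \text{ times}} \tag{24}$$

for every $x \in \mathcal{X}$.

Proof: As in [5], we choose the codewords $\mathbf{c}_j \in \mathcal{X}^\infty$, $j = 1, \ldots, M$, as follows:

$$\mathbf{c}_j = [\underbrace{0, \ldots, 0}_{(j-1)N}, \underbrace{x_0, \ldots, x_0}_{N}, 0, \ldots]. \tag{25}$$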

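Fix an arbitrary $\tau \in (0, \epsilon)$. For a given received signal $\mathbf{Y} \in \mathcal{Y}^\infty$, the decoder runs $M$ parallel
binary hypothesis tests $Z_j$, $j = 1, \ldots, M$, between $P_{\mathbf{Y}|\mathbf{X}=\mathbf{0}}$ and $P_{\mathbf{Y}|\mathbf{X}=\mathbf{c}_j}$. Here, $Z_j = 1$ indicates
that the test selects $P_{\mathbf{Y}|\mathbf{X}=\mathbf{c}_j}$. The tests $\{Z_j\}$, $j = 1, \ldots, M$, are chosen to satisfy

$$P[Z_j = 1 \,|\, \mathbf{X} = \mathbf{c}_j] \ge 1 - \epsilon + \tau \tag{26}$$

$$P[Z_j = 1 \,|\, \mathbf{X} = \mathbf{0}] = \beta_{1-\epsilon+\tau}\big(P_{\mathbf{Y}|\mathbf{X}=\mathbf{c}_j}, P_{\mathbf{Y}|\mathbf{X}=\mathbf{0}}\big). \tag{27}$$

The existence of tests that satisfy (26) and (27) is guaranteed by the Neyman-Pearson lemma [26].
The decoder outputs the index $m$ if $Z_m = 1$ and $Z_j = 0$ for all $j \ne m$. It outputs 1 if no such
index can be found.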
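
The structure of this scheme (pulse-position codewords repeated over $N$ symbols, followed by $M$ parallel binary tests) can be sketched in a few lines of Python. The sketch covers only the bookkeeping: where codeword $j$ is nonzero, and how the decoder combines the binary test outcomes. The hypothesis tests themselves are not implemented, and both function names are hypothetical.

```python
def codeword_support(j, n_rep):
    """0-based positions where codeword j (1-based) equals x0.

    Codeword j is zero everywhere except on a block of n_rep symbols
    that starts after (j - 1) * n_rep zeros.
    """
    start = (j - 1) * n_rep
    return range(start, start + n_rep)


def decode(test_outcomes):
    """Combine the M binary test outcomes into a message estimate.

    test_outcomes[j - 1] is the decision Z_j of the j-th test. The decoder
    returns the unique index m with Z_m = 1 and falls back to 1 when no
    such unique index exists.
    """
    hits = [j for j, z in enumerate(test_outcomes, start=1) if z == 1]
    return hits[0] if len(hits) == 1 else 1


if __name__ == "__main__":
    print(list(codeword_support(3, 4)))  # block occupied by codeword 3
    print(decode([0, 0, 1, 0]))          # a single test fires
    print(decode([0, 1, 1, 0]))          # ambiguous outcome, fall back to 1
```

Because the supports of different codewords are disjoint, the codewords are orthogonal, which is what makes the parallel tests statistically identical across messages.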


^4 For stationary memoryless channels, the capacity per unit cost is given by $\lim_{\epsilon \to 0} \lim_{E \to \infty} (\log M^*(E, \epsilon))/E$.
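By construction, the maximum probability of error of the code just defined is upper-bounded by

$$\epsilon \le P[Z_1 = 0 \,|\, \mathbf{X} = \mathbf{c}_1] + (M - 1)P[Z_1 = 1 \,|\, \mathbf{X} = \mathbf{0}] \tag{28}$$

$$\le \epsilon - \tau + (M - 1)\beta_{1-\epsilon+\tau}\big(P_{\mathbf{Y}|\mathbf{X}=\mathbf{c}_1}, P_{\mathbf{Y}|\mathbf{X}=\mathbf{0}}\big). \tag{29}$$

Here, (28) follows because for each test $Z_j$ ($j \ne 1$) satisfying (26) and (27),

$$P[Z_j = 1 \,|\, \mathbf{X} = \mathbf{c}_1] = P[Z_j = 1 \,|\, \mathbf{X} = \mathbf{0}] \tag{30}$$

$$= P[Z_1 = 1 \,|\, \mathbf{X} = \mathbf{0}] \tag{31}$$

and (29) follows by (26) and (27). From (29), we conclude that

$$M - 1 \ge \frac{\tau}{\beta_{1-\epsilon+\tau}\big(P_{\mathbf{Y}|\mathbf{X}=\mathbf{c}_1}, P_{\mathbf{Y}|\mathbf{X}=\mathbf{0}}\big)}. \tag{32}$$

The proof is completed by noting that

$$\beta_{1-\epsilon+\tau}\big(P_{\mathbf{Y}|\mathbf{X}=\mathbf{c}_1}, P_{\mathbf{Y}|\mathbf{X}=\mathbf{0}}\big) = \beta_{1-\epsilon+\tau}\big(P^{\otimes N}_{Y|X=x_0}, P^{\otimes N}_{Y|X=0}\big) \tag{33}$$

and by maximizing the RHS of (32) over $\tau \in (0, \epsilon)$. ■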


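The proof of Theorem 1 is based on the same binary hypothesis-testing decoder that is used
in the proof of the $\kappa\beta$ bound [7, Th. 25]. In fact, if $P_{Y|X=x_0} \ll P_{Y|X=0}$, a slightly weakened
version of (32), with $M - 1$ replaced by $M$, follows directly from the $\kappa\beta$ bound [7, Th. 25]
upon setting $Q_{\mathbf{Y}} = P_{\mathbf{Y}|\mathbf{X}=\mathbf{0}}$ and choosing the set $\mathcal{F}$ as

$$\mathcal{F} \triangleq \Big\{\mathbf{x} \in \mathcal{X}^\infty : \mathbf{x} = [\underbrace{0, \ldots, 0}_{(j-1)N}, \underbrace{x_0, \ldots, x_0}_{N}, 0, \ldots] \text{ for some } j \in \mathbb{N}\Big\}. \tag{34}$$

Since $\beta_{1-\epsilon+\tau}(P_{\mathbf{Y}|\mathbf{X}=\mathbf{x}}, Q_{\mathbf{Y}})$ takes the same value for all $\mathbf{x} \in \mathcal{F}$, to establish this looser bound
it is sufficient to show that (proof omitted)

$$\kappa_\tau(\mathcal{F}, Q_{\mathbf{Y}}) = \tau \tag{35}$$

where $\kappa_{(\cdot)}(\cdot, \cdot)$ is given in (10).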

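Using the same codebook as in Theorem 1 together with a maximum likelihood decoder, we
obtain a different achievability bound, which is stated in the following theorem.

Theorem 2: Consider a stationary memoryless channel $(\mathcal{X}, P_{Y|X}, \mathcal{Y})$ that has a zero-cost input
symbol. For every $N \in \mathbb{N}$, every $0 < \epsilon < 1$, and every input symbol $x_0 \in \mathcal{X}$ satisfying $b[x_0] > 0$,
there exists an $(E, M, \epsilon)$-code for which $E = b[x_0]N$ and

$$\epsilon \le \mathbb{E}\Big[\min\Big\{1, (M - 1)\,P\big[\imath_N(x_0; Y^N) \le \imath_N(x_0; \bar{Y}^N) \,\big|\, Y^N\big]\Big\}\Big]. \tag{36}$$

Here,

$$P_{Y^N \bar{Y}^N}(y^N, \bar{y}^N) = P^{\otimes N}_{Y|X=x_0}(y^N)\,P^{\otimes N}_{Y|X=0}(\bar{y}^N)$$

and

$$\imath_N(x; y^N) \triangleq \log\frac{dP^{\otimes N}_{Y|X=x}}{dP^{\otimes N}_{Y|X=0}}(y^N) \tag{37}$$

with $P^{\otimes N}_{Y|X=x}$, $x \in \mathcal{X}$, defined in (24).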

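Remark 1: For AWGN channels with cost function $b[x] = x^2$, one can recover [8, Eq. (15)]
from (36) by setting $N = 1$ and $x_0 = \sqrt{E}$.

Proof: We use the same codebook as in Theorem 1, together with a maximum likelihood
decoder. Let

$$\imath(\mathbf{x}, \mathbf{y}) \triangleq \log\frac{dP_{\mathbf{Y}|\mathbf{X}=\mathbf{x}}}{dP_{\mathbf{Y}|\mathbf{X}=\mathbf{0}}}(\mathbf{y}). \tag{38}$$

Let $Y^N = [Y_1, \ldots, Y_N]$ denote the vector containing the first $N$ entries of $\mathbf{Y}$ and let $\bar{Y}^N \sim P^{\otimes N}_{Y|X=0}$
be independent of $\mathbf{Y}$. The probability of error $\epsilon$ is upper-bounded as follows:

$$\epsilon \le P_{\mathbf{Y}|\mathbf{X}=\mathbf{c}_1}\Big[\bigcup_{j=2}^{M}\{\imath(\mathbf{c}_1, \mathbf{Y}) \le \imath(\mathbf{c}_j, \mathbf{Y})\}\Big] \tag{39}$$

$$= \mathbb{E}\Big[P\Big[\bigcup_{j=2}^{M}\{\imath(\mathbf{c}_1, \mathbf{Y}) \le \imath(\mathbf{c}_j, \mathbf{Y})\} \,\Big|\, Y^N\Big]\Big] \tag{40}$$

$$\le \mathbb{E}\Big[\min\Big\{1, (M - 1)\,P\big[\imath(\mathbf{c}_1, \mathbf{Y}) \le \imath(\mathbf{c}_2, \mathbf{Y}) \,\big|\, Y^N\big]\Big\}\Big] \tag{41}$$

$$= \mathbb{E}\Big[\min\Big\{1, (M - 1)\,P\big[\imath_N(x_0; Y^N) \le \imath_N(x_0; \bar{Y}^N) \,\big|\, Y^N\big]\Big\}\Big]. \tag{42}$$

Here, (39) follows because all codewords have the same error probability under maximum likelihood
decoding; (41) follows by choosing the tighter bound between 1 and the union bound; (42)
follows because $\mathbf{c}_1 = [\underbrace{x_0, \ldots, x_0}_{N}, \underbrace{0, \ldots, 0}_{N}, 0, \ldots]$ and $\mathbf{c}_2 = [\underbrace{0, \ldots, 0}_{N}, \underbrace{x_0, \ldots, x_0}_{N}, 0, \ldots]$, and
because, under $P_{\mathbf{Y}|\mathbf{X}=\mathbf{c}_1}$, the sequence $Y_{N+1}^{2N}$ has the same distribution as $\bar{Y}^N \sim P^{\otimes N}_{Y|X=0}$.
Furthermore, $Y_{N+1}^{2N}$ is independent of $Y^N$ since the channel is stationary and memoryless. ■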
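On the converse side, we have the following result, which follows by applying the meta-converse
theorem [7, Th. 31] with $Q_{\mathbf{Y}} = P_{\mathbf{Y}|\mathbf{X}=\mathbf{0}}$.

Theorem 3: Consider a channel $(\mathcal{X}, P_{Y|X}, \mathcal{Y})$ that has a zero-cost input symbol. Every $(E, M, \epsilon)$-code
with codewords satisfying the cost constraint (22) satisfies

$$M \le \frac{1}{\inf\limits_{\mathbf{x}:\ \sum_i b[x_i] \le E} \beta_{1-\epsilon}\big(P_{\mathbf{Y}|\mathbf{X}=\mathbf{x}}, P_{\mathbf{Y}|\mathbf{X}=\mathbf{0}}\big)}. \tag{43}$$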
The bound (43) is in general not computable because it involves an optimization over
infinite-dimensional codewords. As we shall see in the next section, in the MIMO Rayleigh block-fading
case it is possible to reduce this infinite-dimensional optimization problem to a three-dimensional
one, which can be solved numerically.

We would like to remark that the general bounds developed in this section apply to both the
no-CSI case and the perfect-CSIR case. For the perfect-CSIR case, we view the pair $(\mathbb{V}, \mathbb{H})$ as
the channel output, and identify the channel law with $P_{\mathbb{V},\mathbb{H}|\mathbb{U}} = P_{\mathbb{H}}P_{\mathbb{V}|\mathbb{H},\mathbb{U}}$. For the no-CSI case,
we view $\mathbb{V}$ as the output and identify the channel law with $P_{\mathbb{V}|\mathbb{U}}$, which is obtained by averaging
$P_{\mathbb{V}|\mathbb{H},\mathbb{U}}$ over the fading matrix $\mathbb{H}$. In both cases, the channel is stationary and memoryless.


B. Nonasymptotic Bounds: the No-CSI Case 

Particularizing Theorems 1 and 2 to the channel (12) for the no-CSI case, we obtain the 
achievability bounds given below in Corollaries 4 and 5. 

Corollary 4: For every $E > 0$ and every $0 < \epsilon < 1$, there exists an $(E, M, \epsilon)$-code for the
MIMO Rayleigh block-fading channel (12) for the case of no CSI satisfying

$$M - 1 \ge \sup_{0 < \tau < \epsilon,\ N \in \mathbb{N}} \frac{\tau}{P[G_N \ge (1 + E/N)\xi]} \tag{44}$$

where $G_N \sim \mathrm{Gamma}(m_r N, 1)$ and $\xi$ satisfies

$$P[G_N \le \xi] = \epsilon - \tau. \tag{45}$$

Proof: Every code for the memoryless SIMO Rayleigh-fading channel ($m_t = n_c = 1$)
can be used on a MIMO Rayleigh block-fading channel with $m_t > 1$ and $n_c > 1$. Indeed, it
is sufficient to switch off all transmit antennas but one, and to limit transmissions to the first
channel use in each coherence interval. Therefore, it is sufficient to prove that (44) is achievable
for the memoryless SIMO Rayleigh-fading channel (21). In the SIMO case, we have (see (21))

$$\log\frac{dP^{\otimes N}_{Y|X=x_0}}{dP^{\otimes N}_{Y|X=0}}(y^N) = \frac{x_0 \log e}{1 + x_0}\sum_{r=1}^{m_r}\sum_{i=1}^{N} y_{r,i} - m_r N \log(1 + x_0). \tag{46}$$

Let $x_0 = E/N$ for some $N \in \mathbb{N}$. Then, under $P^{\otimes N}_{Y|X=x_0}$, the random variable
$\log\frac{dP^{\otimes N}_{Y|X=x_0}}{dP^{\otimes N}_{Y|X=0}}(Y^N)$ has the same distribution as

$$\frac{E}{N}\,G_N \log e - m_r N \log(1 + E/N) \tag{47}$$

where $G_N \sim \mathrm{Gamma}(m_r N, 1)$, and, under $P^{\otimes N}_{Y|X=0}$, it has the same distribution as

$$\frac{E}{N + E}\,G_N \log e - m_r N \log(1 + E/N). \tag{48}$$

The proof of (44) is concluded by using (47) and (48) in (23) together with the Neyman-Pearson
lemma [26], and by optimizing over $N \in \mathbb{N}$. ■

Corollary 5: For every $E > 0$ and every $0 < \epsilon < 1$, there exists an $(E, M, \epsilon)$-code for the
MIMO Rayleigh block-fading channel (12) for the case of no CSI satisfying

$$\epsilon \le \min_{N \in \mathbb{N}} \mathbb{E}\Big[\min\Big\{1, (M - 1)\,P\big[\bar{G}_N \ge (1 + E/N)G_N \,\big|\, G_N\big]\Big\}\Big] \tag{49}$$

where $G_N$ and $\bar{G}_N$ are i.i.d. $\mathrm{Gamma}(m_r N, 1)$ random variables.

Proof: We proceed first as in the proof of Corollary 4. Then, we use (47) and (48) in (36). ■


Numerical evidence (provided in Section III-E) suggests that (49) is tighter than (44). However,
(44) is more suitable for asymptotic analyses.
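
To indicate how a bound of the form $M - 1 \ge \tau / P[G_N \ge (1 + E/N)\xi]$, with $P[G_N \le \xi] = \epsilon - \tau$ and $G_N \sim \mathrm{Gamma}(m_r N, 1)$, might be evaluated, the following Python sketch computes its logarithm for a fixed pair $(\tau, N)$. It relies on the closed-form tail of a Gamma distribution with integer shape (an Erlang distribution) and on a plain bisection for the quantile. The function names, the log-domain implementation, and the bisection are our own choices, $N_0$ is normalized to one, and the maximization over $\tau$ and $N$ is left to the caller.

```python
import math


def erlang_log_sf(n, t):
    """log P[G >= t] for G ~ Gamma(n, 1) with integer shape n >= 1."""
    if t <= 0:
        return 0.0
    # P[G >= t] = exp(-t) * sum_{j=0}^{n-1} t^j / j!, evaluated in the log domain.
    terms = [j * math.log(t) - math.lgamma(j + 1) for j in range(n)]
    top = max(terms)
    return -t + top + math.log(sum(math.exp(v - top) for v in terms))


def erlang_quantile(n, p, iters=200):
    """Value xi with P[G <= xi] = p, found by bisection."""
    lo, hi = 0.0, float(n)
    while 1.0 - math.exp(erlang_log_sf(n, hi)) < p:
        hi *= 2
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if 1.0 - math.exp(erlang_log_sf(n, mid)) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def log_codebook_size_bound(energy, eps, tau, n_rep, m_r=1):
    """Natural log of tau / P[G_N >= (1 + E/N) xi] for fixed tau and N."""
    shape = m_r * n_rep
    xi = erlang_quantile(shape, eps - tau)
    return math.log(tau) - erlang_log_sf(shape, (1 + energy / n_rep) * xi)


if __name__ == "__main__":
    # Coarse search over N for E = 100, eps = 1e-3, tau = eps / 2 (our choices).
    best = max(log_codebook_size_bound(100.0, 1e-3, 5e-4, n) for n in range(1, 60))
    print(best / math.log(2), "bits (lower bound on log2(M - 1))")
```

Working in the log domain avoids underflow when the tail probability in the denominator becomes very small, which is the regime of interest for large $E$.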

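We now provide a converse bound, which is based on Theorem 3.

Theorem 6: Let $\{S_i\}$ be i.i.d. $\mathrm{Exp}(1)$-distributed random variables. Every $(E, M, \epsilon)$-code for
the MIMO Rayleigh block-fading channel (12) for the case of no CSI satisfies

$$\frac{1}{M} \ge \sup_{\eta > 0} \frac{\inf\limits_{\mathbf{x}} P\Big[\sum_{i=1}^{\infty}\big(x_i S_i \log e - \log(1 + x_i)\big) \le \eta\Big] - \epsilon}{\exp(\eta)}. \tag{50}$$

The infimum in (50) is over all $\mathbf{x} \in \mathbb{R}_+^\infty$ taking one of the following two forms:

$$\mathbf{x} = [q_3, \underbrace{q_2, \ldots, q_2}_{N}, q_1, 0, 0, \ldots] \tag{51}$$

or

$$\mathbf{x} = [\underbrace{q_2, \ldots, q_2}_{N_2}, \underbrace{q_1, \ldots, q_1}_{N_1}, 0, 0, \ldots]. \tag{52}$$

Here, $N \in \mathbb{N}$ and $0 \le q_1 \le q_2 \le q_3$ satisfy $q_1 + Nq_2 + q_3 = m_r E$. Furthermore, $N_1, N_2 \in \mathbb{N}$ and
$0 \le q_1 \le q_2$ satisfy $N_1 q_1 + N_2 q_2 = m_r E$.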

Remark 2: The optimization over infinite-dimensional codewords in the converse bound (43)
is reduced in (50) to a three-dimensional optimization problem. This makes (50) numerically
computable. In words, the conditions in (51) and (52) imply that i) the entries of $\mathbf{x}$ can take
at most three distinct nonzero values, and that ii) if the entries of $\mathbf{x}$ take exactly three distinct
nonzero values, then both the largest and the smallest nonzero entries must appear only once.
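
The structure described in this remark lends itself to a simple numerical search. The following Python sketch builds candidate inputs of the two admissible shapes and estimates, by Monte Carlo, the probability $P[\sum_i (x_i S_i - \log_e(1 + x_i)) \le \eta]$ with $S_i \sim \mathrm{Exp}(1)$ that appears in the converse bound. It is an illustration under stated assumptions: natural logarithms, a plain Monte Carlo estimator instead of an exact quantile computation, and hypothetical function names. The outer search over the free parameters is not shown.

```python
import math
import random


def form_three_values(q1, q2, q3, n):
    """Input [q3, q2 (n times), q1]: largest and smallest entries appear once."""
    return [q3] + [q2] * n + [q1]


def form_two_values(q1, q2, n1, n2):
    """Input [q2 (n2 times), q1 (n1 times)]."""
    return [q2] * n2 + [q1] * n1


def info_density_cdf(x, eta, num_samples, rng):
    """Monte Carlo estimate of P[sum_i (x_i S_i - ln(1 + x_i)) <= eta]."""
    offset = sum(math.log1p(v) for v in x)
    hits = 0
    for _ in range(num_samples):
        total = sum(v * rng.expovariate(1.0) for v in x) - offset
        hits += total <= eta
    return hits / num_samples


if __name__ == "__main__":
    rng = random.Random(1)
    x = form_two_values(0.5, 1.0, 4, 2)  # total energy 4
    print(sum(x), info_density_cdf(x, 1.0, 20000, rng))
```

A full evaluation of the converse would minimize this probability over the admissible inputs with total energy $m_r E$ and then optimize over $\eta$.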

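Proof: Without loss of generality, we can assume that each codeword matrix $\mathsf{C}_j$ satisfies the energy constraint (14) with equality. Indeed, for an arbitrary code $\mathcal{C}$, we can construct a new code $\mathcal{C}'$ by appending to each codeword matrix $\mathsf{C}_j$ in $\mathcal{C}$ an extra $n_c \times m_t$ block of energy $E - \|\mathsf{C}_j\|_F^2$ (recall that the number of transmitted symbols is unlimited). The resulting code $\mathcal{C}'$ has the same number of codewords as $\mathcal{C}$ and each codeword of $\mathcal{C}'$ satisfies (14) with equality. Moreover, the error probability of $\mathcal{C}'$ cannot exceed that of $\mathcal{C}$.

We continue the proof of (50) by using Theorem 3, which implies

$$\frac{1}{M} \geq \inf_{\mathsf{U}^\infty \in \mathcal{A}:\, \|\mathsf{U}^\infty\|_F^2 = E} \beta_{1-\epsilon}\bigl(P_{\mathbb{V}^\infty \mid \mathbb{U}^\infty = \mathsf{U}^\infty},\, P_{\mathbb{V}^\infty \mid \mathbb{U}^\infty = 0}\bigr). \tag{53}$$

For a given $\mathsf{U}^\infty = \{\mathsf{U}_i\}$, let $\tilde{\mathsf{U}}^\infty = \{\tilde{\mathsf{U}}_i\}$ where each $\tilde{\mathsf{U}}_i$ is a diagonal matrix whose diagonal elements are the singular values of $\mathsf{U}_i$. We shall next show that

$$\beta_{1-\epsilon}\bigl(P_{\mathbb{V}^\infty \mid \mathbb{U}^\infty = \mathsf{U}^\infty},\, P_{\mathbb{V}^\infty \mid \mathbb{U}^\infty = 0}\bigr) = \beta_{1-\epsilon}\bigl(P_{\mathbb{V}^\infty \mid \mathbb{U}^\infty = \tilde{\mathsf{U}}^\infty},\, P_{\mathbb{V}^\infty \mid \mathbb{U}^\infty = 0}\bigr). \tag{54}$$

This implies that to evaluate the RHS of (53), it suffices to focus on diagonal matrices $\{\tilde{\mathsf{U}}_i\}$. Note also that when the input matrices are diagonal, the $m_t \times m_r$ MIMO block-fading channel (12) decomposes into $\min\{m_t, n_c\}$ noninteracting memoryless SIMO fading channels with $m_r$ receive antennas. Therefore, exploiting the equivalence between (19) and (21), we conclude that the RHS of (53) coincides with

$$\inf_{x \in \mathbb{R}_+^\infty:\, \|x\|_1 = E} \beta_{1-\epsilon}\bigl(P_{Y \mid X = x},\, P_{Y \mid X = 0}\bigr) \tag{55}$$

where $P_{Y \mid X}$ is the conditional distribution of the output of the channel (21) given the input.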

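To prove (54), we note that given $\mathbb{U}_i = \mathsf{U}_i$, the column vectors of the output matrix $\mathbb{V}_i$ are i.i.d. $\mathcal{CN}(0, \mathsf{I}_{n_c} + \mathsf{U}_i \mathsf{U}_i^H)$-distributed. Therefore, the probability distribution $P_{\mathbb{V}_i \mid \mathbb{U}_i = \mathsf{U}_i}$ depends on $\mathsf{U}_i$ only through $\mathsf{U}_i \mathsf{U}_i^H$. In particular, it is invariant to right-multiplication of $\mathsf{U}_i$ by an arbitrary $m_t \times m_t$ unitary matrix $\tilde{\mathsf{G}}$. Furthermore, since the noise matrix $\mathbb{Z}_i$ is isotropically distributed [27, Def. 6.21], for every $n_c \times n_c$ unitary matrix $\mathsf{G}$ and every $\mathsf{U} \in \mathbb{C}^{n_c \times m_t}$, the conditional distribution of $\mathbb{V}_i$ given $\mathbb{U}_i = \mathsf{U}$ coincides with that of $\mathsf{G}^H \mathbb{V}_i$ given $\mathbb{U}_i = \mathsf{G}\mathsf{U}$. Therefore, for every $i \in \mathbb{N}$, and every pair of unitary matrices $\mathsf{G}$ and $\tilde{\mathsf{G}}$, we have

$$\beta_{1-\epsilon}\bigl(P_{\mathbb{V}_i \mid \mathbb{U}_i = \mathsf{U}},\, P_{\mathbb{V}_i \mid \mathbb{U}_i = 0}\bigr) = \beta_{1-\epsilon}\bigl(P_{\mathsf{G}^H\mathbb{V}_i \mid \mathbb{U}_i = \mathsf{G}\mathsf{U}\tilde{\mathsf{G}}},\, P_{\mathsf{G}^H\mathbb{V}_i \mid \mathbb{U}_i = 0}\bigr) \tag{56}$$

$$= \beta_{1-\epsilon}\bigl(P_{\mathbb{V}_i \mid \mathbb{U}_i = \mathsf{G}\mathsf{U}\tilde{\mathsf{G}}},\, P_{\mathbb{V}_i \mid \mathbb{U}_i = 0}\bigr). \tag{57}$$

Here, the second step follows because $\beta_{1-\epsilon}(\cdot,\cdot)$ stays unchanged under the change of variables $\mathbb{V}_i \mapsto \mathsf{G}^H \mathbb{V}_i$. Since $\mathsf{G}$, $\tilde{\mathsf{G}}$, and $i$ are arbitrary, and since the channel $P_{\mathbb{V}^\infty \mid \mathbb{U}^\infty}$ is block-memoryless, (57) implies (54).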

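Next, we lower-bound $\beta_{1-\epsilon}(P_{Y \mid X = x}, P_{Y \mid X = 0})$ in (55) using [7, Eq. (102)]. Specifically, we fix an arbitrary $\eta \in \mathbb{R}$ and obtain

$$\beta_{1-\epsilon}\bigl(P_{Y \mid X = x},\, P_{Y \mid X = 0}\bigr) \geq \exp(-\eta)\Bigl(P_{Y \mid X = x}\bigl[\imath(x, Y) \leq \eta\bigr] - \epsilon\Bigr) \tag{58}$$

where $\imath(\cdot,\cdot)$ was defined in (38). Under $P_{Y \mid X = x}$, the random variable $\imath(x, Y)$ has the same distribution as

$$\sum_{r=1}^{m_r} \sum_{i=1}^{\infty} \bigl(x_i S_{r,i} \log e - \log(1 + x_i)\bigr) \tag{59}$$

where $\{S_{r,i}\}$ are i.i.d. $\mathrm{Exp}(1)$-distributed. Substituting (59) into (58), and then (58) into (53), we obtain

$$\frac{1}{M} \geq \frac{\inf_{x \in \mathbb{R}_+^\infty:\, \|x\|_1 = E} P\!\left[\sum_{r=1}^{m_r}\sum_{i=1}^{\infty} \bigl(x_i S_{r,i} \log e - \log(1 + x_i)\bigr) \leq \eta\right] - \epsilon}{\exp(\eta)} \tag{60}$$

$$\geq \frac{\inf_{x \in \mathbb{R}_+^\infty:\, \|x\|_1 = m_r E} P\!\left[\sum_{i=1}^{\infty} \bigl(x_i S_i \log e - \log(1 + x_i)\bigr) \leq \eta\right] - \epsilon}{\exp(\eta)} \tag{61}$$

where $\{S_i\}$ are again i.i.d. $\mathrm{Exp}(1)$-distributed. Here, (61) follows because the feasible region of the optimization problem in (60) is contained in the feasible region of the optimization problem in (61).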

Lemma 7 below, which is proven in Appendix I, sheds light on the structure of the vectors $x^*$ that minimize the RHS of (61).

Lemma 7: Let $x^*$ be a minimizer of

$$\inf_{x \in \mathbb{R}_+^\infty:\, \|x\|_1 = m_r E} P\!\left[\sum_{i=1}^{\infty} \bigl(x_i S_i \log e - \log(1 + x_i)\bigr) \leq \eta\right]. \tag{62}$$

Assume without loss of generality that the entries of $x^*$ are in nonincreasing order. Then, $x^*$ must be of the form given in (51) or in (52).

The proof of Theorem 6 is concluded by using Lemma 7 in (61) and by maximizing the RHS of (61) over $\eta$. ■

Remark 3: The proof of Lemma 7 relies on an elegant argument of Abbe, Huang, and Telatar [22], used in the proof of Telatar's minimum outage probability conjecture for MISO Rayleigh-fading channels. Indeed, both [22] and Lemma 7 deal with the optimization of quantiles of a weighted convolution of exponential distributions.


C. Asymptotic Analysis 

Evaluating the bounds in Corollary 4 and Theorem 6 in the limit $E \to \infty$, we obtain the asymptotic closed-form expansion for $M^*(E,\epsilon)$ provided in the following theorem.

Theorem 8: The maximum number of messages $M^*(E,\epsilon)$ that can be transmitted with energy $E$ and error probability $\epsilon \in (0,1/2)$ over the MIMO Rayleigh block-fading channel (12) for the case of no CSI admits the following expansion as $E \to \infty$:

$$\log M^*(E,\epsilon) = m_r E \log e - V_0 \cdot \bigl(m_r E\, Q^{-1}(\epsilon)\bigr)^{2/3} \bigl(\log(m_r E)\bigr)^{1/3} + \text{(lower-order terms)}. \tag{63}$$

Here, $V_0$ is given in (5).

Proof: See Appendix III. ■

The intuition behind (63) is as follows. It is well known that in the no-CSI case, to achieve the asymptotic limit $-1.59$ dB, it is necessary to use flash signalling [2]. If all codewords satisfy a peak-power constraint $\|x\|_\infty \leq A$ in addition to (14), then $\log M^*(E,\epsilon)/(m_r E)$ converges as $E \to \infty$ to (see [13] and [14, Eq. (59)])

$$\log e - A^{-1}\log(1 + A). \tag{64}$$


The second term in (64) can be interpreted as the penalty due to bounded peakiness, which vanishes as $A \to \infty$. When the energy $E$ is finite, as in our setup, it turns out that for large $E$

$$\frac{\log M^*(E,\epsilon)}{m_r E} \approx \log e - \frac{\log(1 + A)}{A} - \sqrt{\frac{A}{m_r E}}\, Q^{-1}(\epsilon) \log e. \tag{65}$$

The second term on the RHS of (65) captures the fact that codewords that satisfy (14) for a finite $E$ are necessarily peak-power limited. The third term captures the penalty resulting from the stochastic variations of the fading and the noise processes, which cannot be averaged out for finite $E$. This penalty increases with the peak power. Coarsely speaking, peakier codewords result in less channel averaging. To summarize, peakiness in the codewords reduces the second term on the RHS of (65) but increases the third term. The optimal peak power $A^*$ that minimizes the sum of these two penalty terms turns out to be



( 66 ) 


Substituting (66) into (65) we obtain (63). See Appendix III for a rigorous proof. 
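The trade-off described above can be explored numerically. The sketch below minimizes the sum of the two penalty terms over the peak power $A$ by a logarithmic grid search. It assumes the penalty has the form $\log(1+A)/A + \sqrt{A/(m_r E)}\,Q^{-1}(\epsilon)$ in nats (that is, with $\log e = 1$), which is our reading of (65); the function names, the search range, and the grid size are illustrative choices and not part of the paper.

```python
import math
from statistics import NormalDist


def penalty(a, energy, eps, m_r=1):
    """Sum of the peak-power and dispersion penalties (nats per unit m_r * E)."""
    q_inv = NormalDist().inv_cdf(1.0 - eps)  # Q^{-1}(eps)
    peak_term = math.log1p(a) / a
    dispersion_term = math.sqrt(a / (m_r * energy)) * q_inv
    return peak_term + dispersion_term


def best_peak_power(energy, eps, m_r=1, points=4000):
    """Grid search for the peak power A minimizing penalty(A) (assumed range)."""
    lo, hi = 1e-2, m_r * energy
    best_a, best_val = lo, penalty(lo, energy, eps, m_r)
    for i in range(1, points + 1):
        a = lo * (hi / lo) ** (i / points)
        val = penalty(a, energy, eps, m_r)
        if val < best_val:
            best_a, best_val = a, val
    return best_a
```

Under these assumptions, the minimizing peak power grows with $E$ but much more slowly than $E$ itself, which matches the qualitative picture that the codewords become peakier while the number of nonzero symbols still grows.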

D. The Perfect-CSIR Case 

In this section, we provide achievability and converse bounds on $M^*(E,\epsilon)$ for the case of perfect CSIR. To state our achievability bound, it is convenient to introduce the following complex AWGN channel

$$Y_i = X_i + Z_i, \quad i \in \mathbb{N}. \tag{67}$$

Here, $\{Z_i\}$ are i.i.d. $\mathcal{CN}(0,1)$-distributed random variables. Theorem 9 below allows us to relate the performance of optimal codes for the AWGN channel (67) to the performance of optimal codes for the MIMO Rayleigh block-fading channel (12).

Theorem 9: Consider an arbitrary $(m_r E, M, \epsilon)$-code for the AWGN channel (67). There exists a sequence of $(E, M, \epsilon_N)$-codes for the MIMO Rayleigh block-fading channel (12) with perfect CSIR, for which $\lim_{N \to \infty} \epsilon_N \leq \epsilon$.

Remark 4: Theorem 9 holds also if the fading is not Rayleigh, provided that the entries of $\mathbb{H}_i$ are i.i.d. and satisfy $\mathbb{E}[|H_{i,j,k}|^2] = 1$.

Proof: As in the proof of Corollary 4, it is sufficient to consider the case $m_t = n_c = 1$. Take an arbitrary $(m_r E, M, \epsilon)$-code for the AWGN channel. We assume without loss of generality that only the first $M$ entries of each codeword are nonzero. This is because, for the AWGN channel, the error probability under maximum likelihood decoding depends only on the Euclidean distance between codewords, and because we can embed the $M$ codewords in an $M$-dimensional space without changing their Euclidean distances.

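Next, we transform the SIMO memoryless fading channel (with perfect CSIR) into an AWGN channel as follows. Fix an arbitrary $N \in \mathbb{N}$; for every codeword $u = [u_1, \ldots, u_M, 0, \ldots]$ for the AWGN channel, we generate the following codeword $\tilde{u}$ for the memoryless SIMO fading channel (19):

$$\tilde{u} = \frac{1}{\sqrt{m_r N}} \bigl[\,\underbrace{u_1, \ldots, u_1}_{N}, \ldots, \underbrace{u_M, \ldots, u_M}_{N}, 0, \ldots\,\bigr]. \tag{68}$$

By construction, $\|\tilde{u}\|_2^2 = \|u\|_2^2 / m_r$. For a given channel output $\{V_{r,i}\}$ (see (19)), the receiver performs coherent combining across the $m_r$ receive antennas and the length-$N$ repetition block:

$$\tilde{V}_j = \frac{1}{\sqrt{m_r N}} \sum_{r=1}^{m_r} \sum_{i=1}^{N} H^*_{r,(j-1)N+i}\, V_{r,(j-1)N+i} \tag{69}$$

$$= \frac{u_j}{m_r N} \sum_{r=1}^{m_r} \sum_{i=1}^{N} \bigl|H_{r,(j-1)N+i}\bigr|^2 + \frac{1}{\sqrt{m_r N}} \sum_{r=1}^{m_r} \sum_{i=1}^{N} H^*_{r,(j-1)N+i}\, Z_{r,(j-1)N+i}. \tag{70}$$

If we let $N \to \infty$, the first term in (70) converges in distribution to $u_j$ by the law of large numbers, and the second term converges in distribution to $Z_j \sim \mathcal{CN}(0,1)$ by the central limit theorem. Therefore, $\tilde{V}_j$ converges in distribution to $u_j + Z_j$. Thus, $P_{\tilde{V}^M \mid U^M = u^M}$ converges in distribution to an AWGN channel law $P^{\mathrm{AWGN}}_{V^M \mid U^M = u^M} = \mathcal{CN}(u^M, \mathsf{I}_M)$ as $N \to \infty$.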
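The convergence claimed for the repetition-and-combining construction can be checked empirically. The sketch below simulates one AWGN symbol $u$ sent as $N$ scaled repetitions over an i.i.d. Rayleigh SIMO channel with $m_r$ receive antennas and then combined coherently, assuming the per-use model $V = H\tilde{u} + Z$ with $H, Z \sim \mathcal{CN}(0,1)$; all function names and parameter values are illustrative.

```python
import math
import random


def complex_gauss(rng):
    """Draw one CN(0, 1) sample."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def combined_output(u, n_rep, m_r, rng):
    """Send u / sqrt(m_r * n_rep) over m_r * n_rep fading uses and combine coherently."""
    scale = 1.0 / math.sqrt(m_r * n_rep)
    tx = u * scale
    acc = 0j
    for _ in range(m_r * n_rep):
        h = complex_gauss(rng)
        z = complex_gauss(rng)
        v = h * tx + z              # one use of the fading channel
        acc += h.conjugate() * v    # matched-filter (coherent) combining
    return acc * scale


def simulate(u, n_rep, m_r, trials, seed=0):
    """Empirical mean and variance of the combined output over many trials."""
    rng = random.Random(seed)
    samples = [combined_output(u, n_rep, m_r, rng) for _ in range(trials)]
    mean = sum(samples) / trials
    var = sum(abs(s - mean) ** 2 for s in samples) / trials
    return mean, var
```

For large `n_rep`, the empirical mean should be close to `u` and the empirical variance close to 1, consistent with an equivalent unit-variance AWGN channel.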

We next evaluate the error probability $\epsilon_N$ of the code that we constructed above. Let $\mathcal{D}_j$ denote the decoding region for message $j$, $1 \leq j \leq M$, and let $\mathrm{Int}(\mathcal{D}_j)$ denote the interior of $\mathcal{D}_j$. It follows that for every $1 \leq j \leq M$


lim 1 — Cat 

N^OO 


(71) 



(72) 



(73) 


_ pAWGN r-p ] 

— 1 UM=Uj [^j\ 

(74) 


> 1-e. 

(75) 

Here, (73) follows because $P_{\tilde{V}^M \mid U^M = u_j}$ converges in distribution to $P^{\mathrm{AWGN}}_{V^M \mid U^M = u_j}$ and because $\mathrm{Int}(\mathcal{D}_j)$ is open; (74) follows because the boundary of the maximum likelihood decoding region $\mathcal{D}_j$ has zero probability measure under $P^{\mathrm{AWGN}}_{V^M \mid U^M = u_j}$. ■

Note that the proof of Theorem 9 above requires perfect CSIR. The approach just described does not necessarily work if only partial CSI is available at the receiver. For example, consider the following partial-CSI model [28]

$$V_l = (\hat{H}_l + \tilde{H}_l) U_l + Z_l, \quad l \in \mathbb{N} \tag{76}$$

where $\hat{H}_l \sim \mathcal{CN}(0, \rho)$, $\rho \in (0,1)$, $\tilde{H}_l \sim \mathcal{CN}(0, 1 - \rho)$, and $\{\hat{H}_l\}$ and $\{\tilde{H}_l\}$ are independent. We assume that the receiver has perfect knowledge of $\{\hat{H}_l\}$, but knows only the statistics of $\{\tilde{H}_l\}$. The random variables $\{\hat{H}_l\}$ and $\{\tilde{H}_l\}$ can be viewed as the estimates of the channel coefficients and the estimation errors, respectively [28]. Following steps similar to the ones in the proof of [2, Th. 7], one can show that flash signalling is necessary to achieve the $-1.59$ dB limit. Hence, spreading the energy as is done in the proof of Theorem 9 is not first-order optimal.

For the case where perfect CSI is available at both the transmitter and the receiver, and where the fading distribution has infinite support (e.g., Rayleigh distribution), it is well known that the minimum energy per bit converges to 0 in the limit $k \to \infty$ and $\epsilon \to 0$ [2, p. 1325].

Using the approach used in the proof of Theorem 9, one can show that $E_b^*(k, \epsilon) = 0$ for every $k$ and $\epsilon > 0$. Indeed, since both the transmitter and the receiver have perfect CSI, they can agree to use the channel only if the fading gain $|H|^2$ is above a threshold $\Gamma$. By doing so, we have transformed the original fading channel into a channel with a fading distribution $P_{\tilde{H}}$ that satisfies $\mathbb{E}_{P_{\tilde{H}}}[|\tilde{H}|^2] \geq \Gamma$. Proceeding as in the proof of Theorem 9, we conclude that every $(E, 2^k, \epsilon)$ code for the AWGN channel can be converted into an $(E/\mathbb{E}_{P_{\tilde{H}}}[|\tilde{H}|^2], 2^k, \epsilon)$ code for the fading channel with distribution $P_{\tilde{H}}$. Since $\Gamma$ can be taken arbitrarily large, we conclude that the minimum energy per bit $E_b^*(k,\epsilon)$ is 0.

Theorem 9 implies that the asymptotic expansion (2) with $E$ replaced by $m_r E$ is achievable in the perfect-CSIR case. Theorem 10 below establishes that, for $0 < \epsilon < 1/2$, the converse is also true.

Theorem 10: The maximum number of messages $M^*(E,\epsilon)$ that can be transmitted with energy $E$ and error probability $0 < \epsilon < 1/2$ over the MIMO Rayleigh block-fading channel (12) for the case of perfect CSIR satisfies

$$\log M^*(E,\epsilon) = m_r E \log e - \sqrt{2 m_r E}\, Q^{-1}(\epsilon) \log e + \frac{1}{2}\log(m_r E) + O\bigl(\sqrt{\log E}\bigr) \tag{77}$$

as $E \to \infty$.

Proof: See Appendix IV. ■ 
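To get a feeling for how quickly the perfect-CSIR energy per bit approaches $-1.59$ dB, one can evaluate the first three terms of the expansion numerically. The sketch below assumes the expansion has the form $m_r E \log e - \sqrt{2 m_r E}\,Q^{-1}(\epsilon)\log e + \tfrac{1}{2}\log(m_r E)$, ignores the $O(\sqrt{\log E})$ remainder, works in nats, and fixes $m_r = 1$; it is therefore an approximation for illustration rather than a bound, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def log_m_nats(energy, eps):
    """Three-term approximation of log M*(E, eps) in nats for m_r = 1."""
    q_inv = NormalDist().inv_cdf(1.0 - eps)  # Q^{-1}(eps)
    return energy - math.sqrt(2.0 * energy) * q_inv + 0.5 * math.log(energy)


def eb_db_perfect_csir(energy, eps):
    """Approximate energy per bit (in dB) when energy E/N0 is spent."""
    bits = log_m_nats(energy, eps) / math.log(2.0)
    return 10.0 * math.log10(energy / bits)


LIMIT_DB = 10.0 * math.log10(math.log(2.0))  # about -1.59 dB
```

For example, `eb_db_perfect_csir(1e6, 1e-3) - LIMIT_DB` gives the approximate gap to the asymptotic limit at $E/N_0 = 10^6$.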

Unlike Theorem 9, the converse part of Theorem 10 relies on the Gaussianity of the fading coefficients and does not necessarily hold for other fading distributions. Indeed, consider a single-input single-output (SISO) on-off fading channel $P_{V,H \mid U}$ where the channel coefficients $\{H_i\}$ are i.i.d. and satisfy

$$P[H_i = 0] = \epsilon', \qquad P\bigl[|H_i|^2 = 1/(1-\epsilon')\bigr] = 1 - \epsilon' \tag{78}$$

where $0 < \epsilon' < \epsilon$. Such a fading distribution satisfies $\mathbb{E}[|H_i|^2] = 1$. Set now N
and 


M = 



1, Xo 


Ve, 

(79) 


Let 


V, h) = 


dP^ 


V,H\U=u 


dP. 


V,H\U=0 


v,h). 


(80) 


By Theorem 2, there exists an $(E, M, \epsilon'')$-code for which the maximal probability of error $\epsilon''$ is upper-bounded as follows:


e"<E minjl, (M-l)P i{xo;V,H) < i{xo;V, H) \V, H 

< (1 - e')E Linj 1, (M - 1)P [?(xo; V, H) < *(xo; V, H) V, |i/ p = (1 - e')“^ 


< (1 - 6 ') 


e — e 
1-e' 


B 




B 


B oo. 


(81) 
+ e' (82) 

(83) 


Here, in (81) and (82), $P_{H,V,\bar{V}}(h, v, \bar{v}) = P_H(h) P_{V \mid H,U}(v \mid h, x_0) P_{V \mid H,U}(\bar{v} \mid h, 0)$, and (83) follows from [8, Eqs. (33)-(40)]. For sufficiently large $E$, the RHS of (83) is less than $\epsilon$. This implies that $M^*(E,\epsilon) \geq M$ for sufficiently large $E$. Furthermore, by [8, Eqs. (47)-(49)], we have

logM*(E,e) > logM > + 0{Ve), B^oo. (84) 


Clearly, the RHS of (84) is greater than the RHS of (77) (computed for $m_r = 1$) for large $E$.

In Theorem 11 below, we present a nonasymptotic converse bound, which we shall evaluate numerically in Section III-E.

Theorem 11: Fix r; > 6, 77 > 0, and 0 < e < 1/2. Fet xi(? 7 ) > r/ be the unique solution of 


Furthermore, let 


40F 



Q 


/ r]-xi \ 

V ) ' 


(85) 


9ri{x) = 


Q[{x-r])/\/^) , x>xi{r]), 

'i] - xi{r])\ 


X 


;Q 


, X < Xi{ri). 


( 86 ) 


xiiv) 

Every $(E, M, \epsilon)$-code for the MIMO Rayleigh block-fading channel (12) for the case of perfect CSIR satisfies

$$\log M \leq \eta \log e - \log\bigl|g_\eta(m_r E) - \epsilon\bigr|^{+}. \tag{87}$$

Proof: See Appendix V.

Figure 1. Minimum energy per bit versus number of information bits; here $\epsilon = 10^{-3}$ and $m_r = 1$.

Table I
Minimum energy E and optimal number of channel uses N* vs. number of information bits k for the case $\epsilon = 10^{-3}$

          Cor. 4               Cor. 5               Asymptotics
k         E/N0        N*       E/N0        N*       E/N0        N*
10^1      98          25       67          18       120         50
10^2      2.6 x 10^2  39       2.4 x 10^2  38       2.9 x 10^2  63
10^3      1.3 x 10^3  96       1.3 x 10^3  96       1.4 x 10^3  124
10^4      9.6 x 10^3  304      9.6 x 10^3  304      9.7 x 10^3  336
10^5      8.2 x 10^4  1089     8.2 x 10^4  1090     8.2 x 10^4  1137

E. Numerical Results 

Fig. 1 shows the achievability bounds (Corollary 4 and Corollary 5) and the converse bound (Theorem 6) for the channel (12) for the no-CSI case and when $\epsilon = 10^{-3}$ and $m_r = 1$. Specifically, the energy per bit $E_b = E/\log_2 M^*(E,\epsilon)$ is plotted against the number of information bits $\log_2 M^*(E,\epsilon)$. (The numerical routines used to obtain these results are available at https://github.com/yp-mit/spectre.) For the perfect-CSIR case, we plot the converse bound (Theorem 11) together with the achievability bound provided in [8, Eq. (15)] for the AWGN case. As proved in Theorem 9, this bound is also achievable in the perfect-CSIR case. As expected, as the number of information bits increases, the minimum energy per bit converges to $-1.59$ dB regardless of whether CSIR is available or not. However, for a fixed number of information bits, it is more costly to communicate in the no-CSI case than in the perfect-CSIR case. For example, it takes 2 dB more energy to transmit 1000 information bits in the no-CSI case compared to the perfect-CSIR case. Additionally, to achieve an energy per bit of $-1.5$ dB, we need to transmit $7 \times 10^7$ information bits in the no-CSI case, but only $6 \times 10^4$ bits when perfect CSIR is available.

The codebook used in both Corollary 4 and Corollary 5 uses only one symbol of the input alphabet in addition to 0. In Table I we list the number of channel uses $N^* = E/x_0$ over which the optimal input symbol $x_0$ is repeated, as a function of the number of information bits $k$. For comparison, we also list the number of repetitions $N^*$ predicted by the asymptotic analysis (see (191)).
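When reading Fig. 1 and Table I, it helps to convert a pair (energy, number of bits) into an energy per bit in dB and into a gap to the $-1.59$ dB limit. The helper below is a minimal sketch of that conversion; the function names are hypothetical, and the input values in the usage note are illustrative rather than taken from the paper's numerical routines.

```python
import math

# Asymptotic limit E_b / N_0 = ln 2, expressed in dB (about -1.59 dB).
SHANNON_LIMIT_DB = 10.0 * math.log10(math.log(2.0))


def energy_per_bit_db(energy_over_n0, k_bits):
    """Energy per bit in dB for total energy E/N0 spent on k information bits."""
    return 10.0 * math.log10(energy_over_n0 / k_bits)


def gap_to_limit_db(energy_over_n0, k_bits):
    """How far (in dB) the operating point is above the -1.59 dB limit."""
    return energy_per_bit_db(energy_over_n0, k_bits) - SHANNON_LIMIT_DB
```

For instance, spending $E/N_0 = 1300$ on $k = 1000$ bits corresponds to roughly 1.14 dB per bit, that is, about 2.7 dB above the asymptotic limit.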


IV. Conclusions 

In this paper, we established nonasymptotic bounds on the minimum energy per bit $E_b^*(k, \epsilon)$ required to transmit $k$ information bits with error probability $\epsilon$ over a MIMO Rayleigh block-fading channel. As the number of information bits $k$ goes to infinity, the ratio between $E_b^*(k,\epsilon)$ and the noise level converges to $-1.59$ dB, regardless of whether CSIR is available or not. However, in the nonasymptotic regime of finite $k$ and nonzero error probability $\epsilon$, the minimum energy per bit required in the no-CSI case is larger than that in the perfect-CSIR case (see Fig. 1). Specifically, as $k \to \infty$ the gap to $-1.59$ dB is proportional to $((\log k)/k)^{1/3}$ in the no-CSI case, and to $1/\sqrt{k}$ in the perfect-CSIR case.

The optimal signalling strategies for the two cases are different: in the no-CSI case, the transmitted codewords must have sufficient peakiness in order to overcome the lack of channel knowledge; in the perfect-CSIR case, the energy of each codeword must be spread uniformly over sufficiently many fading blocks in order to mitigate the stochastic variations in the received-signal energy caused by the fading process.

Throughout the paper, we have focused on the scenario where the blocklength of the code is unlimited, i.e., the spectral efficiency is zero. From a practical perspective, generalizing our analysis to the case of low but nonzero spectral efficiency is of interest. In the asymptotic regime $k \to \infty$, this can be done by approximating the spectral efficiency by an affine function of the energy per bit, and by characterizing the slope of the spectral efficiency versus energy per bit function at $-1.59$ dB (wideband slope) [2]. A generalization of Verdú's wideband-slope analysis to the finite-$k$ case seems to require more sophisticated tools than the ones used in the present paper (see [29, Sec. V.C] for some preliminary results in this direction).

Appendix I 
Proof of Lemma 7 

The proof relies on [22]. In particular, we shall make repeated use of [22, Cor. 1 and Lem. 2], which are restated below for convenience. For a continuous random variable $A$, let $f_A$ denote its probability density function (pdf), and let $f'_A$ and $f''_A$ denote the first and the second derivatives of $f_A$, respectively. Furthermore, let $S_1$ and $S_2$ be independent $\mathrm{Exp}(1)$-distributed random variables, which are also independent of $A$. Then, for every $x, q_1, q_2 \in \mathbb{R}_+$ [22, Lem. 2]:

$$f_{A + q_1 S_1}(x) - f_{A + q_2 S_2}(x) = (q_2 - q_1)\, f'_{A + q_1 S_1 + q_2 S_2}(x). \tag{88}$$

This identity can be readily verified by computing the Fourier transform of both sides. Setting $q_2 = 0$ in (88), we obtain [22, Cor. 1]

$$f_{A + q_1 S_1}(x) - f_A(x) = -q_1\, f'_{A + q_1 S_1}(x). \tag{89}$$
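As a quick sanity check on the two identities, one can take $A$ itself to be a scaled exponential, $A = aS_0$, so that every density involved is a hypoexponential density with a closed form. The sketch below uses the standard partial-fraction expression for the pdf of $\sum_i m_i S_i$ with pairwise distinct means $m_i$; the function names and the numerical values are illustrative assumptions, not part of [22] or of the proof.

```python
import math


def _coeffs(means):
    """Partial-fraction coefficients of the pdf of sum_i means[i] * S_i."""
    n = len(means)
    out = []
    for i, mi in enumerate(means):
        denom = 1.0
        for j, mj in enumerate(means):
            if j != i:
                denom *= mi - mj
        out.append(mi ** (n - 2) / denom)
    return out


def hypoexp_pdf(x, means):
    """pdf at x > 0 of sum_i means[i] * S_i with S_i i.i.d. Exp(1), distinct means."""
    return sum(c * math.exp(-x / m) for c, m in zip(_coeffs(means), means))


def hypoexp_pdf_deriv(x, means):
    """First derivative of hypoexp_pdf with respect to x."""
    return sum(-c / m * math.exp(-x / m) for c, m in zip(_coeffs(means), means))


def identity_88_gap(x, a, q1, q2):
    """LHS minus RHS of (88) for A = a * S_0."""
    lhs = hypoexp_pdf(x, [a, q1]) - hypoexp_pdf(x, [a, q2])
    rhs = (q2 - q1) * hypoexp_pdf_deriv(x, [a, q1, q2])
    return lhs - rhs


def identity_89_gap(x, a, q1):
    """LHS minus RHS of (89) for A = a * S_0."""
    lhs = hypoexp_pdf(x, [a, q1]) - hypoexp_pdf(x, [a])
    rhs = -q1 * hypoexp_pdf_deriv(x, [a, q1])
    return lhs - rhs
```

Both gap functions should return values at the level of floating-point rounding for any $x > 0$ and any pairwise distinct positive $a$, $q_1$, $q_2$.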

The proof of Lemma 7 consists of four steps. 

1) We first restrict ourselves to the finite-dimensional setup, i.e., we assume that $x \in \mathbb{R}_+^m$ for some $m \in \mathbb{N}$. We shall derive a necessary condition a minimizer $x^* \in \mathbb{R}_+^m$ must satisfy, by deriving the Karush-Kuhn-Tucker (KKT) optimality conditions (see, e.g., [30, Sec. 5.5.3]).

2) Building upon these conditions, we show that the entries of $x^*$ can take at most three distinct nonzero values.

3) We prove that if the entries of $x^*$ take exactly three distinct nonzero values, then the maximum and the minimum nonzero value must appear only once, i.e., $x^*$ is of the form (51). If the entries of $x^*$ take fewer than three distinct nonzero values, then $x^*$ satisfies (52) trivially.

4) Finally, we take $m$ to infinity to complete the proof.

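Departing from our convention, in this appendix we shall use log to denote the natural logarithm.

A. The KKT conditions

Let

$$\varphi(x, s) \triangleq \sum_{i=1}^{m} \bigl(x_i s_i - \log(1 + x_i)\bigr). \tag{90}$$

Using (90), we can express (62) for the case $x \in \mathbb{R}_+^m$ as

$$\inf_{x \in \mathbb{R}_+^m:\, \|x\|_1 = E} P[\varphi(x, S) \leq \eta]. \tag{91}$$

By the KKT optimality conditions, if $x^*$ is a minimizer of (62), then there must exist a $\lambda \in \mathbb{R}$ such that for all $k = 1, \ldots, m$,

$$\left.\frac{\partial P[\varphi(x, S) \leq \eta]}{\partial x_k}\right|_{x = x^*} \begin{cases} = \lambda, & \text{if } x_k^* > 0 \\ \geq \lambda, & \text{otherwise.} \end{cases} \tag{92}$$

Let $\tilde{S}_k$ be an $\mathrm{Exp}(1)$-distributed random variable that is independent of $S$. Let $\langle x, S \rangle \triangleq \sum_{i=1}^{m} x_i S_i$ and let $\tilde{\eta} \triangleq \eta + \sum_{i=1}^{m} \log(1 + x_i)$. The partial derivative in (92) can be computed through a Fourier analysis as in the proof of [22, Lem. 1]. This yields

$$\frac{\partial P[\varphi(x, S) \leq \eta]}{\partial x_k} = \frac{f_{\langle x, S \rangle}(\tilde{\eta})}{1 + x_k} - f_{\langle x, S \rangle + x_k \tilde{S}_k}(\tilde{\eta}). \tag{93}$$

From (93), it follows that

$$\frac{\partial P[\varphi(x, S) \leq \eta]}{\partial x_j} - \frac{\partial P[\varphi(x, S) \leq \eta]}{\partial x_k} = f_{\langle x, S \rangle + x_k \tilde{S}_k}(\tilde{\eta}) - f_{\langle x, S \rangle + x_j \tilde{S}_j}(\tilde{\eta}) + \frac{(x_k - x_j)\, f_{\langle x, S \rangle}(\tilde{\eta})}{(1 + x_k)(1 + x_j)} \tag{94}$$

$$= (x_j - x_k)\, f'_{\langle x, S \rangle + x_k \tilde{S}_k + x_j \tilde{S}_j}(\tilde{\eta}) + \frac{(x_k - x_j)\, f_{\langle x, S \rangle}(\tilde{\eta})}{(1 + x_k)(1 + x_j)} \tag{95}$$

where in the last step we used (88).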

B. The entries of a minimizer can take at most three distinct nonzero values 


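As in [22], our proof is by contradiction. We shall assume without loss of generality that $m \geq 4$. Let $x^*$ be a minimizer of (91), and assume that the entries of $x^*$ take more than three distinct nonzero values, the smallest four of them being $0 < x_1^* < x_2^* < x_3^* < x_4^*$. Then, by (92) and (95),

$$(1 + x_j^*)\, f'_{\langle x^*, S \rangle + x_1^* \tilde{S}_1 + x_j^* \tilde{S}_j}(\tilde{\eta}) = \frac{f_{\langle x^*, S \rangle}(\tilde{\eta})}{1 + x_1^*}, \quad j = 2, 3, 4. \tag{96}$$

By (89), the left-hand side (LHS) of (96) can be expressed as follows:

$$(1 + x_j^*)\, f'_{\langle x^*, S \rangle + x_1^* \tilde{S}_1 + x_j^* \tilde{S}_j}(\tilde{\eta}) = f'_{\langle x^*, S \rangle + x_1^* \tilde{S}_1 + x_j^* \tilde{S}_j}(\tilde{\eta}) + f_{\langle x^*, S \rangle + x_1^* \tilde{S}_1}(\tilde{\eta}) - f_{\langle x^*, S \rangle + x_1^* \tilde{S}_1 + x_j^* \tilde{S}_j}(\tilde{\eta}). \tag{97}$$

Since the RHS of (96) does not depend on $j$, by substituting (97) into (96) and by taking the difference between the case $j = 2$ and the case $j = 3$, we obtain

$$0 = f'_{\langle x^*, S \rangle + x_1^* \tilde{S}_1 + x_2^* \tilde{S}_2}(\tilde{\eta}) - f'_{\langle x^*, S \rangle + x_1^* \tilde{S}_1 + x_3^* \tilde{S}_3}(\tilde{\eta}) - \Bigl(f_{\langle x^*, S \rangle + x_1^* \tilde{S}_1 + x_2^* \tilde{S}_2}(\tilde{\eta}) - f_{\langle x^*, S \rangle + x_1^* \tilde{S}_1 + x_3^* \tilde{S}_3}(\tilde{\eta})\Bigr) \tag{98}$$

$$= (x_3^* - x_2^*)\Bigl(f''_{\langle x^*, S \rangle + x_1^* \tilde{S}_1 + x_2^* \tilde{S}_2 + x_3^* \tilde{S}_3}(\tilde{\eta}) - f'_{\langle x^*, S \rangle + x_1^* \tilde{S}_1 + x_2^* \tilde{S}_2 + x_3^* \tilde{S}_3}(\tilde{\eta})\Bigr). \tag{99}$$

Here, (99) follows by (88). Set

$$A \triangleq \langle x^*, S \rangle + x_1^* \tilde{S}_1 + x_2^* \tilde{S}_2. \tag{100}$$

Since $x_2^* \neq x_3^*$ by assumption, (99) can be rewritten as

$$f''_{A + x_3^* \tilde{S}_3}(\tilde{\eta}) - f'_{A + x_3^* \tilde{S}_3}(\tilde{\eta}) = 0. \tag{101}$$

Following the same steps as in (98)-(101), we also have that

$$f''_{A + x_4^* \tilde{S}_4}(\tilde{\eta}) - f'_{A + x_4^* \tilde{S}_4}(\tilde{\eta}) = 0. \tag{102}$$

Next, we show that (101) and (102) cannot hold simultaneously. This in turn implies that the entries of $x^*$ must take at most three distinct nonzero values. Let

$$g(t) \triangleq f''_{A + tS}(\tilde{\eta}) - f'_{A + tS}(\tilde{\eta}) \tag{103}$$

where $S \sim \mathrm{Exp}(1)$. Since (101) and (102) imply that $g(x_3^*) = g(x_4^*) = 0$, to establish a contradiction between (101) and (102), it suffices to show that the function $g(t)$ has at most one zero on $(0, \infty)$. Observe that $g(t)$ can be rewritten as

$$g(t) = f''_{A + tS}(\tilde{\eta}) - f'_{A + tS}(\tilde{\eta}) \tag{104}$$

$$= (f''_A * f_{tS})(\tilde{\eta}) - (f'_A * f_{tS})(\tilde{\eta}) \tag{105}$$

$$= \frac{1}{t} \int_0^{\tilde{\eta}} \bigl(f''_A(\tilde{\eta} - z) - f'_A(\tilde{\eta} - z)\bigr) e^{-z/t}\, dz. \tag{106}$$

Since the kernel $e^{-z/t}$ is strictly totally positive [31, p. 11] on $[0, \tilde{\eta}] \times [0, \infty)$, it follows from [31, Th. 3.1(b)] that the number of zeros of $g(t)$ on $(0, \infty)$ cannot exceed the number of sign changes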

of $z \mapsto f''_A(z) - f'_A(z)$ on $(0, \tilde{\eta})$, provided that the latter number is finite. Thus, to prove that $g(t)$ has at most one zero over $(0, \infty)$, it suffices to show that $f''_A(z) - f'_A(z)$ changes sign at most once on $(0, \tilde{\eta})$. In fact, we shall prove that it changes sign at most once over an interval that contains $(0, \tilde{\eta})$. By [22, Lem. 3], $f'_A(z)$ is continuous on $\mathbb{R}$, and there exists a $\bar{z} > 0$ such that $f'_A(z) > 0$ for all $z \in (0, \bar{z})$. Let $\tilde{z} = \arg\max_{z \in [0,1]} f'_A(z)$. Since $f'_A(0) = 0$ (which follows because $f'_A(z) = 0$ for all $z < 0$ and because $f'_A$ is continuous) and since $f'_A(z) > 0$ for all $z \in (0, \bar{z})$, we have that $0 < \tilde{z} \leq 1$. This implies that

$$f'_A(\tilde{z}) - f_A(\tilde{z}) = f'_A(\tilde{z}) - \int_0^{\tilde{z}} f'_A(z)\, dz \geq f'_A(\tilde{z}) - \tilde{z} f'_A(\tilde{z}) \geq 0. \tag{107}$$

By Lemma 12 in Appendix II, $f_A$ is strictly log-concave on $(0, \infty)$, which implies that $z \mapsto f'_A(z)/f_A(z)$ is strictly decreasing on $(0, \infty)$. This in turn implies that there exists a unique $z_0 > 0$ such that $f'_A(z_0) - f_A(z_0) = 0$. It also implies that $f'_A(z) - f_A(z) > 0$ if $z \in (0, z_0)$ and $f'_A(z) - f_A(z) < 0$ if $z > z_0$. We shall now prove that

1) $f''_A(z) - f'_A(z)$ changes sign at most once on $(0, z_0)$;

2) $(0, \tilde{\eta}) \subset (0, z_0)$, i.e.,

$$\tilde{\eta} < z_0. \tag{108}$$

1) The function ff^z) — changes sign at most once on (0,;2o)-‘ It suffices to prove 

that ffiz) — fAiz) is unimodal on (0,2;o). This is done by induction. Recall that is the 
convolution of exponential pdfs (see (100)), i.e., A can be written as for some > 0, 

z = 1,..., m', and 2 < m' < m + 2. Let k = 1,..., m!, denote the partial sum Yl\=i 
and let Zk, k = 2 ,..., denote the solution of f's^izk) — = 0. Recall that, by the strict 

log-concavity of we have that Zk is unique and that f's^^iz) — fski^) > 0 if 2; G (0, Zk) and 

~ fski^) < 0 if z > Zk- It can be verified that /b2 — /b 2 is unimodal on (0, Z 2 ). Assume 
now that is unimodal on (0, Zk) for some k > 2. We next show that — /sk+i is 

unimodal on (0 ,^a:+i)- Note that 

/k+l “ f^k+l = (/k “ /sj * fa^+iSk+i- (109) 

Since and fa^:+A+i smooth and strictly positive on (0, Zk), it follows that (/^^^^ - 

fBk+i)i^k) > 0. This implies that 2:^+1 > Zk- Since is positive and unimodal on 

(0, Zk), and since fa^+iS^+i is log-concave, it follows that /k+i “/b^+i is positive and unimodal 
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on (0, Zfc) [32]. Furthermore, the strict log-concavity of and the definitions of Zk and Zk+i 
imply that, for every 2 ; e [ 2 ;^, 2 :^+ 1 ), 


(/sfe+i /k+i)(^) 


Ofc+l 


/k(^) -( /k+i(^) )) < 0 


( 110 ) 


<0 >0 

The first step follows by applying (89) twice. The inequality (110) implies that /k+i ~ fs^+i 
unimodal on (0,^fc+i). Hence, by induction, — /a is unimodal on (0,2;o). 

2) Proof of (108).’ It follows from (96) that ffiv) > 0’ which implies by [22, Lem. 3] that 
/^(f) > 0 for all t G (0,r)). Therefore, we have 


/A+.*S3(k=(/A*k‘S3)(k>0. 

By the strict log-concavity of fA+x*si') 


Moreover, by (89) and (101), 


( 111 ) 


( 112 ) 


fAiv) - fA+x^,.%iv) = fAiv) - /a+x*S3('^)- (113) 

Using (112) in (113), we conclude that 


fAiv) - fAiv) > 0 


(114) 


This implies (108). 


C. The minimum and maximum nonzero values must each appear only once 

We focus on the case when the entries of x* take exactly three distinct nonzero values. Assume 
without loss of generality that x* has the following form 


X = X 


* * * * * 

Xi, . . . , Xi, X2,..., X2, Xg, . . . , Xg, 0 , . . . , OJ 


Nl 


N 2 


N 3 


(115) 


where 4- XIN 2 + x^N^ = E, 0 < xl < x^ < x%, and N 2 , > 0. We shall prove that 

if A^i > 1, then 


d^F[pix},S) < p] 
dS^ 


(116) 
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where x*^ = x* -[- 5ei — 5e2. Sinee this eontradiets the assumption that x* is a minimizer, we 
eonelude that = 1. Using a similar argument, one ean show that = 1. 

We first eompute the LHS of (116). Assume Ni > 1, so that [a;*]i = [x *]2 = x\. Set 

m 

= 7] + Y1, log(l + Proeeeding similarly as in the proof of (93), we obtain 

dF[(p{x}, S) < T]] 

86 


i=l 


- (/{a:|,S> + (xJ+5)Si(^) f {xs,S) + ixl-5)S2^V5)j 

f{xg,s){vs) f{xg,s){vs) 

, 2(5/(^.,5)(r)5) 

^^•f(®*,S> + (xJ+5)A+K-5)52^^U ^ _ ^2 • 


(117) 


(118) 


Here, (118) follows from ( 88 ). Taking the derivative of the RHS of (118) with respeet to 6 and 

m 

then setting 5 = 0 , we obtain (reeall that fj = rj + Y^ log(l + [a^*]i)) 


2=1 


d^F[ip{x}, S) < r]] 


86^ 


5=0 


= 2 I f' - - ihi - 

' •’ {x*,S)+xlSi+xlS2^ '' _|_ Xi)^ 


From the KKT eondition (96), we know that 


w f{x*,s){v) 

J{x*,s)+xiSi+x*jy I’ ( 1 + + 


= 0 


(119) 


( 120 ) 


where S ~ Exp(l) is independent of all other random variables. Let T = Si + § 2 . Subtraeting 

the LHS of (120) from (119), we obtain 

1 8‘^F[ip{x\, S) < rj\ 

5=0 

f{x*,s){v) 


= [X 


86^ 

2-x*i){f 


{x*,S)+xIT+X2 


Iv) 


(^2 ^l)y {x*S)+xlT+x*S^'^^ 


(1 + Xi)2(l + X 2 ) 

f " -(fi) 

•> {x*S)+xlSi+xlS^ '> 


Xo 


x; 


1 + a;^ 


f" 

J {x-,S)+xlT+x*S 


1 + xl 

iv) ~ f/x*.S\4-x*T4-x*s(^^ ) • 


' {x*,S)+x’lT+x*S^ 

Here, in (121) we used ( 88 ); (122) follows from (120); and in (123) we used (89). 
We shall next make (123) depend on 0 : 3 . Note first that by (89), 

?(^) ~ f{x*,S)+xlT+x'^S^'^'^ 


( 121 ) 

( 122 ) 

(123) 


f” 

J {x*,S)+x’lT+x*S^ 


^{x*,S)+xIT+x*S+x*S3^'^'^ ^{x*,S)+xIT+x*S+x*S3^''^^ 


= Xq 


3 y { x *S)+ xIT + x *S+ x *S3 


iv) - f\ 


{a;*S}+a:*T+a:2S+X3.S3 


i.n) 


(124) 
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Also by (89) and (101), 


^ {x*,S)+x*T+X2S+x^S 3 iv) - f[ a:*,S)+a;jr+X2 5+a:3S3 (^) 


1 


(•^{£c*S)+x*T+a;*S+x*S3^^) •^(x*S}+x*T+x*S+x*S3^'^^^ 

Combining (124) and (125), we conclude that 


^ {x*,S)+xlT+x*S^'^'^ ^ {x* ,S)+xlT+x*S^'^'^ 


•^ 3 -^1 / X/ 

X* y{x*s)+xi 


T+xtS+x^s 


iv) /(a;*S>+a;*T+x*:S+x*53^^)) ■ 


Substituting (126) in (123), we obtain 

1 S) < rj\ 


(5=0 


2 952 

^ {x *2 - xl){xl - xl) ( 

X*, 


-a:^)(xg -x^) / , 

:^(l + xt) y{^*,s)+xi 


T+xtS+xtS3 


iv) 


^{x*,S)+xIT+x*S+x*S3^^') )■ 


Since (x^ — Xi)(x 3 — xl) > 0, to establish (116), it remains to prove that 

^{x*,S)+xlT+x*S+x*.%i^) ~ ^{x*S)+xIT+x*S+x*S3^'^^ ^ 

Let A = {x*, S) + x^S + Xg^s- The LHS of (128) can be rewritten as 

fl+xiri^) - fx+xiri^) 


Ja+x^Si ^A+xlSi) * ^xlS2iv) 


Since A + X 3 .S 3 ~ A + X 3 .S 1 , by (101), 


fl+xlSr^^) - fA+xlS 3 ^fj) = 0 - 


Note that, for every f e ( 0 , 77 ), we have 


fys,w-fys,m= (/i(^) -/i(^))e-<->«& 

0 


=Ht) 


(125) 


(126) 


(127) 


(128) 


(129) 

(130) 

(131) 

(132) 

(133) 
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In Appendix I-Bl, we have shown that the funetion f'~ — f'~ ehanges sign at most onee over 
the interval (O,?)). Therefore, h'{t) = — /~(t))e*/^i ehanges sign at most onee over the 

interval (0,7]). But sinee h{fj) = _ n (fj)) = 0 = /i(0), the funetion h 

does not ehange sign on (O,?)). Indeed, there are three possible eases: 

1) h'{t) = 0 for all t G (0, fj); in this ease h{t) = 0 for all t G (0, fj). 

2 ) there exists a to ^ ( 0 , r]) sueh that h'{t) < 0 on ( 0 , to), h'{to) = 0 , and h'{t) > 0 on {to, 77 ); 
in this ease h{t) <0 for all t G (0, 77 ). 

3) there exists a fo ^ (0, 77 ) sueh that h'{t) > 0 on (0, to), h'{to) = 0, and h'{t) < 0 on {to, fj); 
in this ease h{t) > 0 for all t G ( 0 , 77 ). 

In all three seenarios, h{t) does not ehange sign on (0,77). This implies that /- — /t .a 

does not ehange sign on ( 0 ,77) either. Furthermore, 

pfj 


fA+xlsS^^ 


- f'L . I 
J A+xlSi 


z)dz 


= fA+x*sSv) - fl+xtsS^) + 


2 l+a:*Si 


' A+x*Si 


( 0 ) < 0 . 


(134) 


=0 


Here, the first step follows beeause (0) = (0) = 0 [22, Lem. 3], and the seeond 

step follows from (112). We establish (128) by using the following ehain of inequalities 


fx+xirid) - 

- I (135) 

< 0. (136) 

Here, (135) follows from (130) and beeause > 0 for all z G (0, 77 ); (136) follows 

from (134). 


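D. Extension to $\mathbb{R}_+^\infty$

Consider the following chain of equalities:

$$\inf_{x \in \mathbb{R}_+^\infty:\, \|x\|_1 = m_r E} P\!\left[\sum_{i=1}^{\infty} \bigl(x_i S_i - \log(1 + x_i)\bigr) \leq \eta\right] = \lim_{m \to \infty} \inf_{x \in \mathbb{R}_+^m:\, \|x\|_1 = m_r E} P\!\left[\sum_{i=1}^{m} \bigl(x_i S_i - \log(1 + x_i)\bigr) \leq \eta\right] \tag{137}$$

$$= \lim_{m \to \infty} \inf_{x} P\!\left[\sum_{i=1}^{m} \bigl(x_i S_i - \log(1 + x_i)\bigr) \leq \eta\right] \tag{138}$$

$$= \inf_{x} P\!\left[\sum_{i=1}^{\infty} \bigl(x_i S_i - \log(1 + x_i)\bigr) \leq \eta\right]. \tag{139}$$

Here, both (137) and (139) follow from the monotone convergence theorem [25, Th. 2.14]; the infimum in (138) and (139) is over all $x$ (in $\mathbb{R}_+^m$ and $\mathbb{R}_+^\infty$, respectively) of the form (51) or (52). This concludes the proof of Lemma 7.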


Appendix II 

Convolution of Exponential Distributions 

In this appendix, we summarize some results about the convolution of exponential distributions that are needed in Appendices I, III, and IV.

The first lemma deals with the log-concavity of the convolution of exponential distributions. Recall that a function $f$ is called log-concave if $\log f$ is concave, and it is called strictly log-concave if $\log f$ is strictly concave. Since the exponential distribution is log-concave, and log-concavity is preserved under convolution [32], it follows that the convolution of exponential distributions is also log-concave. Lemma 12 below shows that this distribution is in fact strictly log-concave.

Lemma 12: Fix an integer $m \geq 2$. Let $S_1, \ldots, S_m$ be i.i.d. $\mathrm{Exp}(1)$-distributed random variables, and let $a_1, \ldots, a_m$ be positive real numbers. Furthermore, let $B \triangleq \sum_{i=1}^{m} a_i S_i$. Then, the pdf $f_B$ of $B$ is strictly log-concave on $(0, \infty)$.

Proof: The proof is based on induction. Through algebraic manipulations, it can be verified that $f_{a_1 S_1 + a_2 S_2}$ is strictly log-concave on $(0, \infty)$ for every $a_1, a_2 > 0$. Suppose now that the pdf of $B_k \triangleq \sum_{i=1}^{k} a_i S_i$ is strictly log-concave for some $k \geq 2$. We have

$$f_{B_{k+1}}(t) = \int_{\mathbb{R}} f_{B_k}(t - s)\, f_{a_{k+1} S_{k+1}}(s)\, ds, \quad t > 0. \tag{140}$$

It follows that the integrand $g(s, t)$ in (140) is (jointly) log-concave in $(s, t)$ on $\mathbb{R}^2$ and it is strictly log-concave on the subspace $\{(s, t) \in \mathbb{R}^2 : 0 < s < t\}$. Note that by the Prékopa Theorem [33], [34, Sec. 3] for each $a, b > 0$,



1/2 


(141) 


(142) 


(143) 


This implies that $f_{B_{k+1}}$ is log-concave. Following the proof of the Prékopa Theorem in [34, Sec. 3], and using that the function $(s, t) \mapsto f_{B_k}(t - s)\, f_{a_{k+1} S_{k+1}}(s)$ is strictly positive, smooth [22, Lem. 3], and strictly log-concave for $0 < s < t$, we can verify that the inequality in (142) is strict for every $a, b > 0$. This in turn implies that $f_{B_{k+1}}$ is strictly log-concave on $(0, \infty)$. By induction, $f_B$ is strictly log-concave on $(0, \infty)$ for every $m \geq 2$.
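Strict log-concavity can be spot-checked numerically in the special case where the weights $a_i$ are pairwise distinct, because the pdf then has a closed partial-fraction form. The sketch below evaluates second central differences of $\log f_B$ on a grid; a strictly log-concave density gives strictly negative values. The closed form, the helper names, and the chosen weights are assumptions made for illustration and play no role in the proof.

```python
import math


def hypoexp_pdf(x, weights):
    """pdf at x > 0 of sum_i weights[i] * S_i, S_i i.i.d. Exp(1), distinct weights."""
    n = len(weights)
    total = 0.0
    for i, ai in enumerate(weights):
        denom = 1.0
        for j, aj in enumerate(weights):
            if j != i:
                denom *= ai - aj
        total += ai ** (n - 2) / denom * math.exp(-x / ai)
    return total


def second_differences_of_log_pdf(weights, xs, h):
    """Central second differences of log f_B at each grid point in xs."""
    return [
        math.log(hypoexp_pdf(x + h, weights))
        - 2.0 * math.log(hypoexp_pdf(x, weights))
        + math.log(hypoexp_pdf(x - h, weights))
        for x in xs
    ]
```

A finite grid check of this kind cannot replace the induction argument; it only confirms that the statement is plausible for a given set of weights.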


The next lemma characterizes the optimal convex combination of exponential random variables 
that minimizes the probability that such combination does not exceed a given threshold. 

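Lemma 13: Let $n \in \mathbb{N}$, let $S_1, \ldots, S_n$ be i.i.d. Exp(1)-distributed random variables, and let
$\mathcal{A}_n \triangleq \{x \in \mathbb{R}_+^n : \|x\|_1 \leq 1,\ x_1 \geq x_2 \geq \cdots \geq x_n\}$. Then, for every $t \in \mathbb{R}_+$, there exists a
$k \in \{1, \ldots, n\}$ such that

$$\inf_{x \in \mathcal{A}_n} P\left[\sum_{i=1}^{n} x_i S_i \leq t\right] = P\left[\frac{1}{k} \sum_{i=1}^{k} S_i \leq t\right]. \tag{144}$$

In particular, if $t \in (0, 1]$, then

$$\inf_{x \in \mathcal{A}_n} P\left[\sum_{i=1}^{n} x_i S_i \leq t\right] = P\left[\frac{1}{n} \sum_{i=1}^{n} S_i \leq t\right]. \tag{145}$$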


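Proof: The equality (144) follows directly from [22, p. 2597]. To prove (145), it is sufficient
to show that for every $k \in \mathbb{N}$ and every $t \in (0, 1]$, the following inequality holds:

$$P\left[\frac{1}{k} \sum_{i=1}^{k} S_i \leq t\right] \geq P\left[\frac{1}{k+1} \sum_{i=1}^{k+1} S_i \leq t\right]. \tag{146}$$

Let $f_k(x) \triangleq x^k e^{-x}$. Consider the following chain of (in)equalities

$$P\left[\sum_{i=1}^{k} S_i \leq tk\right] - P\left[\sum_{i=1}^{k+1} S_i \leq t(k+1)\right] = \int_{0}^{tk} \frac{f_{k-1}(x)}{(k-1)!}\, dx - \int_{0}^{t(k+1)} \frac{f_k(x)}{k!}\, dx \tag{147}$$

$$= \frac{1}{k!} \left(\int_{0}^{tk} k f_{k-1}(x)\, dx - \int_{0}^{t(k+1)} f_k(x)\, dx\right) \tag{148}$$

$$= \frac{1}{k!} \left(f_k(tk) - \int_{tk}^{t(k+1)} f_k(x)\, dx\right) \tag{149}$$

$$\geq \frac{1}{k!} \left(f_k(tk) - \int_{tk}^{t(k+1)} \exp\!\left(\log f_k(tk) + \frac{f_k'(tk)}{f_k(tk)} (x - tk) \log e\right) dx\right) \tag{150}$$

$$= \frac{f_k(tk)}{k!} \cdot \frac{1 - t e^{1-t}}{1 - t} \tag{151}$$

$$\geq 0. \tag{152}$$

Here, (147) follows because the random variable $\sum_{i=1}^{k} S_i$ is chi-squared distributed with pdf
$f_{k-1}(x)/(k-1)!$; in (149) we used integration by parts; (150) follows because $f_k(x)$ is log-
concave, which implies that for every $x > 0$

$$\log f_k(x) \leq \log f_k(tk) + \frac{f_k'(tk)}{f_k(tk)} (x - tk) \log e; \tag{153}$$

in (151) we used that $f_k'(tk)/f_k(tk) = (1-t)/t$; finally, (152) follows because $t \mapsto t e^{1-t}$ is monotonically increasing on $(0, 1]$, and because
it equals 1 at $t = 1$. This proves (146). ■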
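The monotonicity in $k$ that drives (145) can be checked numerically. The sketch below, which is only an illustration under stated assumptions, evaluates the cdf of a sum of $k$ i.i.d. Exp(1) variables through its Poisson-tail series (chosen to avoid cancellation for small probabilities) and computes the gap between the two sides of (146) over a grid of $k$ and $t \in (0, 1]$. The grid sizes and the truncation length `terms` are arbitrary choices.

```python
import math

def erlang_cdf(k, x, terms=200):
    """P[S_1 + ... + S_k <= x] for i.i.d. Exp(1) summands.

    Uses the series exp(-x) * sum_{j >= k} x^j / j!, truncated after `terms` terms.
    """
    term = math.exp(-x)
    for j in range(1, k + 1):
        term *= x / j          # now term = exp(-x) * x^k / k!
    acc = 0.0
    for j in range(k, k + terms):
        acc += term
        term *= x / (j + 1)
    return acc

def gap(k, t):
    """LHS minus RHS of (146): P[mean of k <= t] - P[mean of k+1 <= t]."""
    return erlang_cdf(k, t * k) - erlang_cdf(k + 1, t * (k + 1))

# Smallest gap over k = 1..39 and t = 0.05, 0.10, ..., 1.00 (illustrative grid).
worst_gap = min(gap(k, t / 20.0) for k in range(1, 40) for t in range(1, 21))
print(worst_gap)
```

A nonnegative `worst_gap` is consistent with (146) on this grid; for $t > 1$ the inequality is not claimed by the lemma.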

The following lemma provides a uniform lower bound on the cdf of the weighted sum of 
exponential distributions. 

Lemma 14: Let $\{S_i\}$ be i.i.d. Exp(1)-distributed random variables. Let $x = [x_1, x_2, \ldots] \in \mathbb{R}_+^{\infty}$
satisfy $0 < \|x\|_1 < \infty$. Furthermore, let

$$L(x) \triangleq \frac{1}{\|x\|_2} \left(\sum_{i=1}^{\infty} x_i S_i - \|x\|_1\right) \tag{154}$$

and denote the cdf of $L(x)$ by $F_{L(x)}(t)$. Then, for every $t \in (-\infty, 0]$,

$$F_{L(x)}(t) \geq \frac{1}{2} + t. \tag{155}$$

Equivalently,

$$F_{L(x)}^{-1}(\epsilon) \leq \epsilon - \frac{1}{2} \quad \text{for all } 0 < \epsilon < \frac{1}{2}. \tag{156}$$

Proof: Since $\|x\|_1 > 0$, we can assume without loss of generality that $x_1 > 0$. Let
$x_n$ denote the vector that contains the first $n$ entries of $x$, let

$$L_n(x) \triangleq \frac{1}{\|x_n\|_2} \left(\sum_{i=1}^{n} x_i S_i - \|x_n\|_1\right) \tag{157}$$

and let $F_{L_n(x)}$ denote the cdf of $L_n(x)$. Through algebraic manipulations, it can be shown
that $F_{L_n(x)}(t)$ converges pointwise to $F_{L(x)}(t)$ as $n \to \infty$. Hence, to prove (155), it suffices to
show that for every $n \in \mathbb{N}$ and every $t \in (-\infty, 0]$

$$F_{L_n(x)}(t) \geq \frac{1}{2} + t. \tag{158}$$

We first show that (158) holds when $t = 0$. Indeed, we have that

$$F_{L_n(x)}(0) = P\left[\sum_{i=1}^{n} x_i S_i \leq \|x_n\|_1\right] \tag{159}$$

$$\geq \inf_{y \in \mathbb{R}_+^n :\, \|y\|_1 = \|x_n\|_1} P\left[\sum_{i=1}^{n} y_i S_i \leq \|x_n\|_1\right] \tag{160}$$

$$= P\left[\frac{\|x_n\|_1}{n} \sum_{i=1}^{n} S_i \leq \|x_n\|_1\right] \tag{161}$$

$$\geq \frac{1}{2}. \tag{162}$$

Here, (161) follows from (145); (162) follows because $\sum_{i=1}^{n} S_i$ is chi-squared distributed, and
because the median of a chi-squared distribution is smaller than its mean [24, Ch. 17].

We next prove (158) for the case $t < 0$. By definition, $L_n(x)$ has zero mean and unit variance.
Moreover, by Lemma 12 the pdf $f_{L_n(x)}$ of $L_n(x)$ is log-concave. Hence, we have that [35,
Lem. 5.5] [36, Prop. 2.1]

$$\sup_{t \in \mathbb{R}} f_{L_n(x)}(t) \leq 1. \tag{163}$$

Then, the bound (158) holds because

$$F_{L_n(x)}(t) = \underbrace{\int_{-\infty}^{0} f_{L_n(x)}(y)\, dy}_{\geq 1/2} - \int_{t}^{0} \underbrace{f_{L_n(x)}(y)}_{\leq 1}\, dy \geq \frac{1}{2} + t. \tag{164}$$

The last step follows from (162) and (163). ■
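To illustrate the uniform bound (155), the sketch below computes the cdf of the normalized sum exactly for a finite weight vector and compares it with $1/2 + t$ on a grid of $t \in [-1/2, 0]$. It assumes a finite vector with pairwise distinct positive entries (so that the closed-form hypoexponential cdf applies); the weight vector is a hypothetical example.

```python
import math

def hypoexp_cdf(s, x):
    """P[sum_i x[i] * S_i <= s], S_i i.i.d. Exp(1), pairwise distinct x[i] > 0."""
    if s <= 0.0:
        return 0.0
    tail = 0.0
    for i, xi in enumerate(x):
        w = 1.0
        for j, xj in enumerate(x):
            if j != i:
                w *= xi / (xi - xj)
        tail += w * math.exp(-s / xi)
    return 1.0 - tail

def cdf_L(t, x):
    """cdf of L(x) = (sum_i x_i S_i - ||x||_1) / ||x||_2 evaluated at t."""
    l1 = sum(x)
    l2 = math.sqrt(sum(v * v for v in x))
    return hypoexp_cdf(l1 + t * l2, x)

# Hypothetical weight vector; margins[k] = F_L(t_k) - (1/2 + t_k).
x = [3.0, 2.0, 1.0, 0.5]
ts = [-0.5 + 0.01 * k for k in range(51)]
margins = [cdf_L(t, x) - (0.5 + t) for t in ts]
print(min(margins))
```

The margin at $t = 0$ reflects the right skew of the distribution (median below mean), which is the content of (162).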

Consider the random variable obtained by summing finitely many independent but not necessarily
identically distributed exponential random variables. The next lemma establishes that the
derivative of the pdf of the resulting random variable, computed at the mean value, is negative.
Since the convolution of exponential distributions is unimodal, this implies that the mode of this
random variable is smaller than its mean, i.e., its probability distribution is right skewed.

Lemma 15: Let $m \in \mathbb{N}$, let $a_1, \ldots, a_m$ be positive real numbers, and let $S_1, \ldots, S_m$ be i.i.d.
Exp(1)-distributed random variables. Furthermore, let $\mu \triangleq \sum_{i=1}^{m} a_i$, $a_{\min} \triangleq \min_i \{a_i\}$, $a_{\max} \triangleq \max_i \{a_i\}$,
and $A \triangleq \sum_{i=1}^{m} a_i S_i$. Then,

$$f_A'(\mu) \leq -\frac{a_{\min}}{a_{\max}\, \mu}\, f_A(\mu) < 0 \tag{165}$$

where $f_A'$ denotes the derivative of the pdf of $A$. Moreover, the first inequality in (165) holds
with equality if and only if $a_1 = \cdots = a_m$.

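Proof: Note that the $\{S_i\}$, $i = 1, \ldots, m$, have the same distribution as $\{X_i^2 + X_{m+i}^2\}$,
$i = 1, \ldots, m$, where $\{X_i\}$, $i = 1, \ldots, 2m$, are i.i.d. $\mathcal{N}(0, 1/2)$-distributed. Let $a_{m+i} = a_i$,
$i = 1, \ldots, m$. Then, $A$ has the same distribution as

$$\sum_{i=1}^{2m} a_i X_i^2. \tag{166}$$

Next, we prove that $f_A'(\mu) < 0$ by using [18, Lem. 22], which provides expressions for the pdf
and the derivative of the pdf of functions of random variables. We first give some definitions.
Let $X = [X_1, \ldots, X_{2m}]$, and let $f_X$ denote the joint pdf of $X_1, \ldots, X_{2m}$. Let $\varphi : \mathbb{R}^{2m} \to \mathbb{R}$
be defined as

$$\varphi(x) \triangleq \sum_{i=1}^{2m} a_i x_i^2. \tag{167}$$

Let $\nabla\varphi$ and $\Delta\varphi$ be the gradient and Laplacian of $\varphi$, namely,

$$\nabla\varphi(x) = \left[\frac{\partial \varphi}{\partial x_1}, \ldots, \frac{\partial \varphi}{\partial x_{2m}}\right] \tag{168}$$

and

$$\Delta\varphi(x) = \sum_{i=1}^{2m} \frac{\partial^2 \varphi}{\partial x_i^2}. \tag{169}$$

Finally, let $\varphi^{-1}(\mu)$ denote the preimage $\{x \in \mathbb{R}^{2m} : \varphi(x) = \mu\}$, and let $dS$ be the surface area
form on $\varphi^{-1}(\mu)$, chosen so that $dS(\nabla\varphi) > 0$. Note that $f_X$ is smooth and that the set $\varphi^{-1}(\mu)$
is bounded. Moreover, for every $x \in \varphi^{-1}(\mu)$

$$\|\nabla\varphi(x)\|_2^2 = 4 \sum_{i=1}^{2m} a_i^2 x_i^2 \geq 4\mu \min_{i=1,\ldots,m} \{a_i\} > 0. \tag{170}$$

Then, by [18, Eq. (407)],

$$f_A'(\mu) = \int_{\varphi^{-1}(\mu)} \psi_1\, \frac{dS}{\|\nabla\varphi\|_2} \tag{171}$$

where [18, Eq. (422)]

$$\psi_1 \triangleq \frac{\langle \nabla f_X, \nabla\varphi \rangle + f_X \cdot \Delta\varphi}{\|\nabla\varphi\|_2^2} - \frac{f_X \langle \nabla \|\nabla\varphi\|_2^2, \nabla\varphi \rangle}{\|\nabla\varphi\|_2^4}. \tag{172}$$

The first term on the RHS of (172) is equal to zero. Indeed,

$$\langle \nabla f_X, \nabla\varphi \rangle + f_X \cdot \Delta\varphi = \sum_{i=1}^{2m} (-2 x_i f_X) \cdot (2 a_i x_i) + f_X \sum_{i=1}^{2m} 2 a_i \tag{173}$$

$$= 4 f_X \left(-\sum_{i=1}^{2m} a_i x_i^2 + \sum_{i=1}^{m} a_i\right) \tag{174}$$

$$= 0. \tag{175}$$

Here, the last step follows because for every $x \in \varphi^{-1}(\mu)$

$$\sum_{i=1}^{2m} a_i x_i^2 = \mu = \sum_{i=1}^{m} a_i.$$

The second term on the RHS of (172) can be computed as follows:

$$\frac{f_X \langle \nabla \|\nabla\varphi\|_2^2, \nabla\varphi \rangle}{\|\nabla\varphi\|_2^4} = f_X\, \frac{16 \sum_{i=1}^{2m} a_i^3 x_i^2}{\left(4 \sum_{i=1}^{2m} a_i^2 x_i^2\right)^2} \tag{176}$$

$$= f_X\, \frac{\sum_{i=1}^{2m} a_i^3 x_i^2}{\left(\sum_{i=1}^{2m} a_i^2 x_i^2\right)^2} \tag{177}$$

$$\geq f_X\, \frac{a_{\min}}{\sum_{i=1}^{2m} a_i^2 x_i^2} \tag{178}$$

$$\geq \frac{a_{\min}}{a_{\max}\, \mu}\, f_X. \tag{179}$$

Note that the inequality on the RHS of (178) holds with equality if and only if $a_1 = \cdots = a_m$.
Finally, using (175) and (179) in (171) we conclude that

$$f_A'(\mu) \leq -\int_{\varphi^{-1}(\mu)} \frac{a_{\min}}{a_{\max}\, \mu}\, \frac{f_X\, dS}{\|\nabla\varphi\|_2} = -\frac{a_{\min}}{a_{\max}\, \mu}\, f_A(\mu). \tag{180}$$

Here, the last step follows from [18, Lem. 22]. The second inequality in (165) follows because
$f_A(\mu) > 0$ (see [22, Lem. 3]). ■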
Appendix III
Proof of Theorem 8

A. Achievability

To prove that (63) is achievable, we start from the inequality

$$M - 1 \geq \frac{\tau}{\beta_{1-\epsilon+\tau}(P_{Y|X=c_1}, P_{Y|X=0})} \tag{181}$$

which is equivalent to Theorem 1 (see (25) for a definition of $c_1$). First, we upper-bound
$\beta_{1-\epsilon+\tau}(P_{Y|X=c_1}, P_{Y|X=0})$ using [7, Eq. (103)]:

$$\beta_{1-\epsilon+\tau}(P_{Y|X=c_1}, P_{Y|X=0}) \leq \frac{1}{\xi} \tag{182}$$

where $\xi > 0$ satisfies

$$P_{Y|X=c_1}\left[\imath(c_1, Y) \leq \log \xi\right] = \epsilon - \tau \tag{183}$$

and $\imath(\cdot, \cdot)$ was defined in (38). The LHS of (183) can be lower-bounded as follows:

-Py|X=ci[*X,y(Ci, Y) < log^] 

= p 


TUyN 


-m^iV log[l + ^) + < log^ 


2 = 1 


> Q 


log'C + mrA^log(l -f E/N) — rUrE log e\ const 


(182) 

(183) 

(184) 

(185) 

(186) 


Vm(]V ■ (E/A^) loge J ^/N 

Here, const denotes a positive constant independent of $E$ and $N$, (185) follows from (47),
and (186) follows from the Berry-Esseen Theorem (see, e.g., [23, Ch. XVI.5]).

Next, we set $\tau = 1/\sqrt{E}$ in (183) and consider the asymptotic regime $E \to \infty$. We shall
choose $N$ as a function of $E$ so that $N \to \infty$ as $E \to \infty$ with $N/E \to 0$. Substituting (186)
into (183), and solving for $\log \xi$ we obtain


, ^ ,^1 I . E\ JWTt-E loge 1 const\ 

log? > m^Bloge - m,]Vlog| 1 + -j- yf—" Tf + 7Fj 

I E\ ^Jrr^:E log e 1 const \ E\ 

= m,E log e - m,iV log (1 + - ) - ^ ^ Q" ^ ((yf + 7^ ) j 


= nirE log e — nirN log 1 ^ jy 1 “ 


NJ y/N 
E \ y/ri\E log e 




E 


(187) 

(188) 
(189) 


Footnote: Throughout the remainder of the paper, we will use const to denote an arbitrary constant whose exact value is irrelevant for
the analysis. Its value may change at each appearance.

Here, (188) follows by operating a Taylor expansion of $Q^{-1}(\cdot)$ around $\epsilon$, and (189) follows
because $N/E \to 0$.

We next maximize the dominant terms on the RHS of (189) by choosing

$$N = N^* \triangleq \arg\min_{N \in \mathbb{N}} \left\{ m_r N \log\left(1 + \frac{E}{N}\right) + \frac{\sqrt{m_r}\, E \log e}{\sqrt{N}}\, Q^{-1}(\epsilon) \right\}. \tag{190}$$

After some algebraic computations, we obtain

$$N^* = \frac{1}{m_r} \left(\frac{3\, Q^{-1}(\epsilon)\, (m_r E) \log e}{2 \log(m_r E)}\right)^{2/3} + O\left(\frac{E^{2/3} \log\log E}{(\log E)^{5/3}}\right). \tag{191}$$

Substituting (191) into (189), then (189) into (182), and finally (182) into (32) we conclude that

$$\log M^*(E, \epsilon) \geq m_r E \log e - V_0 \cdot \left(m_r E\, Q^{-1}(\epsilon)\right)^{2/3} \left(\log(m_r E)\right)^{1/3} + O\left(\frac{E^{2/3} \log\log E}{(\log E)^{2/3}}\right) \tag{192}$$

where $V_0$ is given in (5).
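The trade-off behind the choice of $N$ can be explored numerically. The following sketch minimizes the two-term penalty of (190) by brute force over integer $N$, working in nats (so that $\log e = 1$). It assumes the penalty has the form $m_r N \ln(1 + E/N) + \sqrt{m_r/N}\, E\, Q^{-1}(\epsilon)$; the function names and the values of `energy` and `eps` are hypothetical choices made for illustration.

```python
import math
from statistics import NormalDist

def q_inv(eps):
    """Inverse of the Gaussian Q-function."""
    return NormalDist().inv_cdf(1.0 - eps)

def penalty(n, energy, q, m_r=1):
    """Assumed objective of (190) in nats: spreading cost plus dispersion cost."""
    return m_r * n * math.log1p(energy / n) + math.sqrt(m_r / n) * energy * q

def best_n(energy, eps, m_r=1):
    """Brute-force minimizer of the penalty over N = 1, ..., floor(energy)."""
    q = q_inv(eps)
    return min(range(1, int(energy) + 1),
               key=lambda n: penalty(n, energy, q, m_r))

# Hypothetical operating point.
energy, eps = 2.0e4, 1e-3
n_star = best_n(energy, eps)
print(n_star, n_star / energy)
```

The first term grows with $N$ (energy spread over more channel uses) while the second shrinks with $N$ (more averaging), so the minimizer grows with $E$ but more slowly than $E$, in line with the requirement $N/E \to 0$.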


B. Converse 

It follows from (61) that for every $\eta \in \mathbb{R}$

$$\log M^*(E, \epsilon) \leq \eta - \log\left(\inf_{x} P\left[\sum_{i=1}^{\infty} \bigl(x_i S_i \log e - \log(1 + x_i)\bigr) \leq \eta\right] - \epsilon\right) \tag{193}$$

where the infimum is taken over all $x$ that are of the form specified in (51) and (52).

Before proceeding to further bound (193), we introduce some notation. To every $x \in \mathbb{R}_+^{\infty}$
satisfying $\|x\|_1 = m_r E$, we assign the random variable

$$L(x) \triangleq \frac{1}{\|x\|_2} \left(\sum_{i=1}^{\infty} x_i S_i - m_r E\right). \tag{194}$$

Let $F_{L(x)}(t)$ be the cdf of $L(x)$. By construction, $L(x)$ has zero mean and unit variance. Let
$\eta_E(x) : \mathbb{R}_+^{\infty} \to \mathbb{R}$ be defined as follows:

$$\eta_E(x) \triangleq m_r E \log e - \sum_{i=1}^{\infty} \log(1 + x_i) + F_{L(x)}^{-1}\left(\epsilon + E^{-1/2}\right) \|x\|_2 \log e. \tag{195}$$

We shall choose $\eta$ so that

$$\eta = \eta_E \triangleq \sup_{x} \eta_E(x) \tag{196}$$

where the supremum is again over all $x$ that are of the form specified in (51) and (52).
Substituting (196) into (193), we obtain

$$\log M^*(E, \epsilon) \leq \sup_{x} \eta_E(x) + O(\log E). \tag{197}$$

To conclude the proof, it remains to show that for every $x$ that is of the form specified in (51)
and (52)

fiE{x) < m,E log e - Vo ■ (rnrEQ~^{e)^ ' (log(mr^)) + O 

To this end, we consider the following three cases separately.

1) The vector x takes the form (52), and N 2 > E^l^\ 

2) The vector $x$ takes the form (51);

3) The vector x takes the form (52), and N 2 < E^!^. 

Case 1: By assumption, x has at most two distinet nonzero entries ^ < qi < q 2 , and N 2 > 
Suppose that we ean approximate + E~^/‘^') by —Q~^{e) in the limit E ^ 00 

(in a sense we shall make preeise later on). The proof is then eoneluded by using the result in 
Lemma 16 below, together with (190) and (191), in (197). 

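Lemma 16: For every positive constant $a$, we have that

$$\inf_{x \in \mathbb{R}_+^{\infty} :\, \|x\|_1 = m_r E} \left\{ \sum_{i=1}^{\infty} \log(1 + x_i) + a \|x\|_2 \right\} = \min_{N \in \mathbb{N}} \left\{ N \log\left(1 + \frac{m_r E}{N}\right) + a\, \frac{m_r E}{\sqrt{N}} \right\}. \tag{199}$$

Proof: See Appendix III-C. ■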

It remains to show that we can indeed approximate $F_{L(x)}^{-1}(\epsilon + E^{-1/2})$ by $-Q^{-1}(\epsilon)$. Since $L(x)$
is the normalized sum of $N_1 + N_2$ independent random variables, and $N_2 \to \infty$ as $E \to \infty$, it
is natural to use the central-limit theorem to establish this result. More precisely, we apply the
Berry-Esseen Theorem [23, Ch. XVI.5] to $F_{L(x)}(\cdot)$ and obtain that, for an arbitrary $\xi \in \mathbb{R}$,
Ni N 2 

qi^S, + q 2 ^ 5 .^^^ - m,E \ < ^ 


FlUO = P 


2 


i=l 


i=l 


( 200 ) 


> Q (-0 - 


const (2Vi(gi)^ + 272 ( 72 )' 


N,(qf^ + N2(q2)^^ 

The seeond term on the RHS of (201) ean be evaluated as follows 


3/2 


( 201 ) 


iVi(7i)^ + iV2(72)' 
iVi(gi)2 + iV2(72) 


3/2 


< 


72 


iVi(gi)2 + iV2(72) 


1/2 


<N 2 


- 1/2 


< E-F^f 


( 202 ) 

(203) 

(204) 
Here, (202) follows because $(q_1)^3 \leq (q_1)^2 q_2$, and in (204) we used the lower bound on $N_2$ assumed in Case 1. Using (204)
in (201), selecting $\xi$ such that the LHS of (200) equals $\epsilon + E^{-1/2}$, and using that the function
$Q(\cdot)$ is monotonically decreasing, we conclude that


^l(L) (e + = e < -Q~^ (e + E-^/^ + const ■ 

= -Q-\e) + 0{E-^/^^) . 


(205) 


(206) 


Here, (206) follows by applying Taylor’s theorem to $Q^{-1}(\cdot)$ around $\epsilon$.

Case 2: By assumption, $x$ contains three distinct nonzero entries $0 < q_1 < q_2 < q_3$, and $q_1$
and $q_3$ each appear only once. For this case, we shall use a different approach from that used
in Case 1. The main differences between the two cases are as follows:

• In order to use the central limit theorem, we need to show that the $x$ that maximizes $\eta_E(x)$
contains sufficiently many nonzero entries, and that the available energy $m_r E$ is spread
evenly over these nonzero entries as $E \to \infty$. These properties are satisfied in Case 1 by
definition. In Case 2, however, we need to verify that they hold.

• Intuitively, since $q_1$ and $q_3$ appear only once in $x$, we expect that they do not contribute
to the dominant terms in (197). As a result, we can approximate the second and the third
term on the RHS of (197) directly without using Lemma 16.

We proceed now with the proof. The idea is to upper-bound (197) using (156) (Lemma 14 in
Appendix II), and then compare the resulting bound with the achievability result (192). Since
$0 < \epsilon < 1/2$, and since we are interested in the asymptotic regime $E \to \infty$, we can assume
without loss of generality that $\epsilon + E^{-1/2} < 1/2$. Applying (156) to (195), we obtain


00 



^1/2 -e-U ||a ;||2 loge + C>(logU) 


(207) 


Since we are interested in upper-bounding $\sup_x \eta_E(x)$, we focus without loss of generality on
the $x$ for which $\eta_E(x)$ is greater than the RHS of (192). By comparing (207) with (192), we
conclude that such $x$ must satisfy



(208) 
and

OO 

5^1og(l + Xi) < Vo ■ {m,EQ-\e)f\log{m,E)Y^^ + o{E^/^) . (209) 

We next refine the bounds (208) and (209) by exploiting that $x$ is of the form specified in (51).
By (208) and (51), we have the following estimates

gs = lla^iloo < lla^lh < (9(E2/3(logE)'/') (210) 


and 


A^ + 2 > 



2 

1 

2 

2 


> const ■ E '^^^(log E) 


( 211 ) 

( 212 ) 


Here, (211) follows beeause x has A^ + 2 nonzero entries and beeause ||a 111 < y/NT2\\ a \\2 for 
every {N + 2)-dimensional real veetor a; in (212) we used (208) and that ||a;||i = irirE. Sinee 
qi + q 2 N + qo = m^E, it follows from (212) that 

gi < g 2 < ^ < [log eY^^) . (213) 

The bound (209) implies that 


^0 ■ {m,EQ-\e)Y^\log{m,E)Y^^ + o{E^/^) 

OO 

> X]log(l + Xi) 


2 = 1 


> 7Vlog(l + q 2 ) 

m,E -qi-qs 


= A^log^l + 

> log I 1 + 


N 

rfir-E — O 


(^E2/3(logE) 


1/3 


N 


Here, in (217) we used (210) and (213). Solving (217) for N, we obtain 

N < const ■ log E) 


(214) 

(215) 

(216) 

(217) 

(218) 
Using (212) and (218) back in (217) we obtain

OO 

log(l + Xi) 

m-rE 


2=1 

> 


TV log (^1 + ^ j + iV log (^1 - O (e-V 3 (log E )^ 


= TVlog(^l ++o(i7'/3(logi7)-'/') 


(219) 

( 220 ) 


Here, the last step follows by Taylor-expanding the log function in the second term on the RHS
of (219) around 1.

We are now ready to provide a refined estimate for the term F^^ie + ||a ;||2 on the 

RHS of (197). Let 


/ A 
X = 


rrirE 


rrirE 


iV + 2’ ’iV + 2 


, 0 ,... 


( 221 ) 


—V— 

7V+2 


By Lemma 13 (see Appendix II) and by (194), the following inequality holds for every $\gamma \in (0, m_r E]$:

N+l 


F, 


L(x) 


7 — m^E 

ll^lh 


= P giS*! + Y 72 F + q3SN+2 < 7 

( 222 ) 

N+2 

> P > F < 7 

“ Lf + 2 ^ J 

i=l 

(223) 


(224) 


Sinee e + E < 1/2 and sinee, by Lemma 14 (see Appendix II), F,( 3 ,')( 0 ) > 1/2, we have 
+ F-i/2) < 0. Set 7 = + FY){^ + E-^^^)\\x '\\2 < m,E. Then, by (224), 

^’i(L)(' + + £;-'-'^)||V||,, (225) 

Applying the Berry-Esseen central-limit theorem similarly as in (200)-(206), we obtain

Furthermore, 

,, rrirE 


^ 2 = 


yivT2 


^/N ( (iV 


(227) 

(228) 
Substituting (226) and (228) into (225) and using (212) and (218), we obtain

+ o{E'l^\ogEf'^) . (229) 

Finally, substituting (220) and (229) into (197), we conclude that

Ve{x) < mr-Eloge — A^log 

- + o(b‘'=( logs)-''’) . (230) 

The proof is completed by maximizing the RHS of (230) over $N \in \mathbb{N}$ and by using (190)
and (191).

Case 3: By assumption, x has at most two different nonzero entries 0 < gi < q 2 , and 
N 2 < . Sinee the multiplieity of q 2 in x is less than , it ean be shown that all entries 

of X that are equal to q 2 do not eontribute to the dominant terms in (197). The analysis follows 
steps similar to the ones for Case 2. 

C. Proof of Lemma 16
Let

$$l_a(x) \triangleq \sum_{i=1}^{\infty} \log(1 + x_i) + a \|x\|_2 \tag{231}$$

with $x_i$ standing for the $i$th entry of $x$, and let $x^*$ be a minimizer of

$$\inf_{x \in \mathbb{R}_+^{\infty} :\, \|x\|_1 = m_r E} l_a(x). \tag{232}$$

In order to prove Lemma 16, it suffices to show that all nonzero entries of $x^*$ must take the
same value. This is proved by contradiction.

Assume that there exist indices $i, j$ for which $0 < x_i^* < x_j^*$. Let $b \triangleq x_i^* + x_j^*$, $c \triangleq \sum_{k \neq i, k \neq j} (x_k^*)^2$,
and $d \triangleq \sum_{k \neq i, k \neq j} \log(1 + x_k^*)$. Consider now the function $f : [0, b] \to \mathbb{R}$ defined as

$$f(t) \triangleq \log(1 + t) + \log(1 + b - t) + a \sqrt{c + t^2 + (b - t)^2} + d. \tag{233}$$

Note that $f(t)$ is symmetric around $t = b/2$, and that $f(x_i^*) = f(x_j^*) = l_a(x^*)$.

Standard computations reveal that the minimum of $f(t)$ over $[0, b/2]$ is achieved at one of the
boundary points, i.e.,

$$f(t) > \min\{f(0), f(b/2)\}, \quad \text{for all } t \in (0, b/2). \tag{234}$$

Let $x_1$ (resp. $x_2$) be the vector obtained from $x^*$ by replacing the $i$th and $j$th entries with 0
(resp. $b/2$) and $b$ (resp. $b/2$), respectively. Clearly, $\|x_1\|_1 = \|x_2\|_1 = m_r E$. Then, (234) implies
that

$$l_a(x^*) > \min\{l_a(x_1), l_a(x_2)\}. \tag{235}$$

This contradicts the assumption that $x^*$ is a minimizer. Therefore, the entries of $x^*$ cannot take
more than one distinct nonzero value.
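The boundary-minimum property (234) is the only analytic ingredient of this proof, and it can be spot-checked on a grid. The sketch below assumes $f$ has the form $f(t) = \ln(1+t) + \ln(1+b-t) + a\sqrt{c + t^2 + (b-t)^2}$ (natural logarithms, with the additive constant $d$ dropped since it does not affect the location of the minimum); the parameter grids are arbitrary illustrative choices.

```python
import math

def f(t, a, b, c):
    """Two-entry objective of (233) in nats, without the constant d."""
    return (math.log1p(t) + math.log1p(b - t)
            + a * math.sqrt(c + t * t + (b - t) ** 2))

def min_at_boundary(a, b, c, steps=2000):
    """True if no interior grid point of (0, b/2) goes below min{f(0), f(b/2)}."""
    lowest_boundary = min(f(0.0, a, b, c), f(b / 2.0, a, b, c))
    return all(f(b / 2.0 * k / steps, a, b, c) >= lowest_boundary - 1e-12
               for k in range(1, steps))

# Hypothetical parameter grid: small a favors t = 0, large a favors t = b/2.
boundary_ok = all(min_at_boundary(a, b, c)
                  for a in (0.01, 0.1, 0.3, 1.0, 3.0)
                  for b in (0.5, 2.0, 10.0)
                  for c in (0.0, 1.0, 25.0))
print(boundary_ok)
```

The two regimes mirror the two terms of $l_a$: the concave logarithmic part rewards concentrating the mass in one entry, while the norm part rewards splitting it evenly.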


Appendix IV 
Proof of Theorem 10 


The achievability of (77) follows from Theorem 9 and [8, Th. 3]. Next, we prove a converse.
As in the proof of Theorem 6, we assume without loss of generality that each codeword $U^{\infty}$ for
the channel (12) satisfies the equal-energy constraint

$$\|U^{\infty}\|_F^2 = E. \tag{236}$$


Let Pv”H°°|u°° — By the meta-converse theorem [7, Th. 31] applied with 

= Tyoojjoo I ujoo—Q, wc obtatn 


> 


inf 






yoojjoo H[JOO^(JCX>, 


OOTUTCXD 


(237) 


Proceeding similarly to the proof of Theorem 6, we observe that the RHS of (237) does not
change if we focus on diagonal input matrices. This implies that for the purpose of evaluating
(237), the MIMO Rayleigh block-fading channel (12) is equivalent to the memoryless SIMO
Rayleigh-fading channel (19). Let now $u$ and $(V, H)$ denote the input and the output of this
SIMO channel, respectively. Then, the RHS of (237) is equal to

$$\inf_{u \in \mathbb{C}^{\infty} :\, \|u\|_2^2 = E} \beta_{1-\epsilon}(P_{VH|U=u}, Q_{VH}) \tag{238}$$

where $P_{VH|U=u}$ denotes the conditional probability distribution of the output of the channel (19)
given the input, and $Q_{VH} = P_{VH|U=0}$. Substituting (238) into (237), and using the lower bound [7,
Eq. (102)], we obtain that for every $\eta > 0$,

$$\log M^*(E, \epsilon) \leq \eta \log e - \log\left(\inf_{u \in \mathbb{C}^{\infty} :\, \|u\|_2^2 = E} P_{VH|U=u}\left[\log \frac{dP_{VH|U=u}}{dQ_{VH}}(V, H) \leq \eta \log e\right] - \epsilon\right). \tag{239}$$

Under $P_{VH|U=u}$, the random variable $\log \frac{dP_{VH|U=u}}{dQ_{VH}}(V, H)$ in (239) has the same distribution as

$$\log e \left(\sum_{r=1}^{m_r} \sum_{i=1}^{\infty} |u_i H_{r,i}|^2 + \left(2 \sum_{r=1}^{m_r} \sum_{i=1}^{\infty} |u_i H_{r,i}|^2\right)^{1/2} Z\right) \tag{240}$$

where $Z \sim \mathcal{N}(0, 1)$.

Let now $\epsilon_E \triangleq \epsilon + c_1 \sqrt{E^{-1} \log E}$, where $c_1 > 0$ is an arbitrary constant. Since, by assumption,
$0 < \epsilon < 1/2$, and since we are interested in the asymptotic behavior of $\log M^*(E, \epsilon)$ as $E \to \infty$,
we can assume without loss of generality that $\epsilon_E < 1/2$. Set

$$\eta \triangleq m_r E - \sqrt{2 m_r E}\, Q^{-1}(\epsilon_E). \tag{241}$$

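Then, we can rewrite the minimization problem on the RHS of (239) using (240) and (241) as
follows

$$\inf_{u \in \mathbb{C}^{\infty} :\, \|u\|_2^2 = E} P_{VH|U=u}\left[\log \frac{dP_{VH|U=u}}{dQ_{VH}}(V, H) \leq \eta \log e\right] \tag{242}$$

$$= \inf_{u \in \mathbb{C}^{\infty} :\, \|u\|_2^2 = E} E\left[Q\left(\frac{\sum_{r=1}^{m_r} \sum_{i=1}^{\infty} |u_i H_{r,i}|^2 - m_r E + \sqrt{2 m_r E}\, Q^{-1}(\epsilon_E)}{\sqrt{2 \sum_{r=1}^{m_r} \sum_{i=1}^{\infty} |u_i H_{r,i}|^2}}\right)\right] \tag{243}$$

$$\triangleq q(E).$$

We next show that $q(E)$ admits the following large-$E$ expansion:

$$q(E) = \inf_{u \in \mathbb{C}^{\infty} :\, \|u\|_2^2 = 1} P\left[\sqrt{\frac{E}{2 m_r}} \left(\sum_{r=1}^{m_r} \sum_{i=1}^{\infty} |u_i H_{r,i}|^2 - m_r\right) \leq Z - Q^{-1}(\epsilon_E)\right] + O\left(\sqrt{\frac{\log E}{E}}\right). \tag{244}$$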

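The key step is to replace the term $\sqrt{2 \sum_{r=1}^{m_r} \sum_{i=1}^{\infty} |u_i H_{r,i}|^2}$ in the denominator on the RHS of (242)
by $\sqrt{2 m_r E}$. To this end, consider the function $t \mapsto Q\bigl((t - \eta)/\sqrt{2t}\bigr)$ with $\eta$ given in (241). If
$t < \eta - 2\sqrt{m_r E \log E}$, we have

$$1 \geq Q\left(\frac{t - \eta}{\sqrt{2t}}\right) \tag{245}$$

$$\geq Q\left(\frac{t - \eta}{\sqrt{2 m_r E}}\right) \tag{246}$$

$$\geq Q\left(-\frac{2\sqrt{m_r E \log E}}{\sqrt{2 m_r E}}\right) \tag{247}$$

$$= 1 - O(E^{-1}). \tag{248}$$

Here, (246) follows because $Q(\cdot)$ is monotonically decreasing and because $t < \eta < m_r E$. The inequality (248) implies that,
if $t < \eta - 2\sqrt{m_r E \log E}$, then

$$\left|Q\left(\frac{t - \eta}{\sqrt{2t}}\right) - Q\left(\frac{t - \eta}{\sqrt{2 m_r E}}\right)\right| \leq O(E^{-1}). \tag{249}$$

Proceeding similarly as in (245)-(248), we can show that (249) holds also if $t > \eta + 2\sqrt{m_r E \log E}$.
Finally, if $|t - \eta| \leq 2\sqrt{m_r E \log E}$, by the mean-value theorem [37, p. 107] there exists an
$a_0 \in \bigl[(t - \eta)/\sqrt{2t},\ (t - \eta)/\sqrt{2 m_r E}\bigr]$ such that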




t — rj 

= IQ'(ao)| 
1 


t — rj 
^/2m^E 
rj t — rj 


1 


-a2/2 


■\/2mj.E 
t — rj 


\/2my.E 


o\ 


(t-vr 




log 77 
t — rj 


^/2rn^ 


O 


■ V' 

< const 


= oi 


log 77 


E 


Here, (251) and (252) follow beeause 

t — rj 


ao 


\/2m^ 


< 


t — rj t — rj 


a/^ ^/2mrE 
t — rj 


^/2mJ■E 


O 


log 77 


E 


log 77 


Combining (249) and (253), we eonelude that for every f G (0, cxd 

't — rj 


Q 




= Q 


t — rj 
\/2mT.E 


+ 0 


log 77 


(250) 

(251) 

(252) 

(253) 

(254) 

(255) 


(256) 


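where the $O\bigl(\sqrt{(\log E)/E}\bigr)$ term is uniform in $t$. This means that replacing the denominator
in (243) with $\sqrt{2 m_r E}$ affects the value of (243) only by $O\bigl(\sqrt{(\log E)/E}\bigr)$. Finally, we establish
(244) by using (256) in (243) and by normalizing $u$ in (243) with respect to $E$.

Lemma 17 below characterizes the solution of the optimization problem in (244).

Lemma 17: Fix an arbitrary $a > 0$, $b > 0$, and $m_r \in \mathbb{N}$. Let $\{H_{r,i}\}$ be i.i.d. $\mathcal{CN}(0, 1)$-
distributed and let $Z \sim \mathcal{N}(0, 1)$ be independent of $\{H_{r,i}\}$. Then, we have

$$\inf_{u \in \mathbb{C}^{\infty} :\, \|u\|_2 = 1} P\left[a \left(\sum_{r=1}^{m_r} \sum_{i=1}^{\infty} |u_i H_{r,i}|^2 - m_r\right) \leq Z - b\right] = Q(b). \tag{257}$$

Proof: See Appendix IV-A. ■

Using Lemma 17 in (244), we obtain

$$q(E) = \epsilon_E + O\left(\sqrt{\frac{\log E}{E}}\right). \tag{258}$$

Finally, substituting (258) and (241) into (239), we conclude that

$$\log M^*(E, \epsilon) \leq m_r E \log e - \sqrt{2 m_r E}\, Q^{-1}(\epsilon_E) \log e - \log\left(\epsilon_E - \epsilon + O\left(\sqrt{\frac{\log E}{E}}\right)\right) \tag{259}$$

$$= m_r E \log e - \sqrt{2 m_r E}\, Q^{-1}\left(\epsilon + c_1 \sqrt{\frac{\log E}{E}}\right) \log e - \log\left(c_1 \sqrt{\frac{\log E}{E}} + O\left(\sqrt{\frac{\log E}{E}}\right)\right) \tag{260}$$

$$\leq m_r E \log e - \sqrt{2 m_r E}\, Q^{-1}(\epsilon) \log e + \frac{1}{2} \log E + O\bigl(\sqrt{\log E}\bigr). \tag{261}$$

Here, the last step follows by Taylor-expanding $Q^{-1}(\cdot)$ around $\epsilon$, and by taking $c_1$ so that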




^ogE\ ^ llogE 


E 


E 


(262) 


This concludes the proof.


A. Proof of Lemma 17 

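First, consider the following sequence of vectors indexed by $N$:

$$u^{(N)} \triangleq \frac{1}{\sqrt{N}} [\underbrace{1, \ldots, 1}_{N}, 0, \ldots]. \tag{263}$$

Evaluating the probability on the LHS of (257) for this sequence of vectors, we establish the
following upper bound

$$\inf_{u \in \mathbb{C}^{\infty} :\, \|u\|_2 = 1} P\left[a \left(\sum_{r=1}^{m_r} \sum_{i=1}^{\infty} |u_i H_{r,i}|^2 - m_r\right) \leq Z - b\right] \leq \lim_{N \to \infty} P\left[a \left(\frac{1}{N} \sum_{r=1}^{m_r} \sum_{i=1}^{N} |H_{r,i}|^2 - m_r\right) \leq Z - b\right] \tag{264}$$

$$= Q(b). \tag{265}$$

Here, the last step follows by the law of large numbers.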

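Next, we prove the reverse inequality. Suppose that for every $m \in \mathbb{N}$ and every $u \in \mathbb{C}^m$ that
satisfies $\|u\|_2 = 1$, the following equality holds:

$$\inf_{a \geq 0} P\left[a \left(\sum_{i=1}^{m} |u_i H_i|^2 - 1\right) \leq Z - b\right] = Q(b). \tag{266}$$

Then,

$$\inf_{u \in \mathbb{C}^{\infty} :\, \|u\|_2 = 1} P\left[a \left(\sum_{r=1}^{m_r} \sum_{i=1}^{\infty} |u_i H_{r,i}|^2 - m_r\right) \leq Z - b\right] \tag{267}$$

$$= \inf_{\substack{u \in \mathbb{C}^{\infty} :\, \|u\|_2^2 = m_r, \\ u_{(k-1) m_r + 1} = \cdots = u_{k m_r},\ k \in \mathbb{N}}} P\left[a \left(\sum_{i=1}^{\infty} |u_i H_i|^2 - m_r\right) \leq Z - b\right] \tag{268}$$

$$\geq \inf_{u \in \mathbb{C}^{\infty} :\, \|u\|_2^2 = m_r} P\left[a \left(\sum_{i=1}^{\infty} |u_i H_i|^2 - m_r\right) \leq Z - b\right] \tag{269}$$

$$= \inf_{u \in \mathbb{C}^{\infty} :\, \|u\|_2 = 1} P\left[m_r a \left(\sum_{i=1}^{\infty} |u_i H_i|^2 - 1\right) \leq Z - b\right] \tag{270}$$

$$\geq \inf_{u \in \mathbb{C}^{\infty} :\, \|u\|_2 = 1}\ \inf_{a \geq 0} P\left[a \left(\sum_{i=1}^{\infty} |u_i H_i|^2 - 1\right) \leq Z - b\right] \tag{271}$$

$$= Q(b). \tag{272}$$

Here, (268) follows because $\{H_{r,i}\}$ are independent and identically distributed. This allows us
to merge the double summation in (267) into one summation, provided that we account for the
fact that each $u_i$ must now multiply $m_r$ successive $\{H_i\}$ (see the additional constraint on (268)).
The inequality (269) follows by enlarging the feasible region of the minimization problem on
the RHS of (268).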

We next prove (266). Through standard algebraic manipulations, it can be verified that (266)
holds when $m = 1$. Fix now an arbitrary $m \geq 2$ and an arbitrary $u \in \mathbb{C}^m$ that satisfies $\|u\|_2 = 1$.
Assume without loss of generality that all entries of $u$ are positive (otherwise just set $m$ to be
the number of positive entries in $u$). Let $B \triangleq \sum_{i=1}^{m} |u_i H_i|^2$ and let

$$g(a) \triangleq P[a(B - 1) \leq Z - b] = E[Q(aB - a + b)]. \tag{273}$$

Since $g(0) = Q(b)$, it suffices to show that $g(a)$ is nondecreasing on $[0, \infty)$, i.e., $g'(a) \geq 0$ for
all $a \in [0, \infty)$. The derivative $g'(a)$ is given by

$$g'(a) = \frac{d}{da} \bigl(E_B[Q(aB - a + b)]\bigr) \tag{274}$$

$$= -\int_{-1}^{\infty} \frac{t}{\sqrt{2\pi}}\, e^{-\frac{a^2 t^2 + 2atb + b^2}{2}} f_{B-1}(t)\, dt. \tag{275}$$

Here, in (275) we used Leibniz’s integration rule [38] and the identity $Q'(x) = -\frac{1}{\sqrt{2\pi}} e^{-x^2/2}$.
The RHS of (275) is equal to zero when $a = 0$ because, by definition, $E[B - 1] = 0$. When
$a > 0$, we have

$$g'(a) \geq -\frac{e^{-b^2/2}}{\sqrt{2\pi}} \int_{-1}^{\infty} t\, e^{-\frac{a^2 t^2}{2}} f_{B-1}(t)\, dt \tag{276}$$

$$= -\frac{e^{-b^2/2}}{a^2 \sqrt{2\pi}} \int_{-1}^{\infty} e^{-\frac{a^2 t^2}{2}} f_{B-1}'(t)\, dt \tag{277}$$

$$= -\frac{e^{-b^2/2}}{a^3}\, f_{B+Z/a}'(1). \tag{278}$$

Here, (276) follows because $e^{-abt}\, t \leq t$ for every $t \in \mathbb{R}$; in (277) we used integration by parts
and that $f_{B-1}(-1) = 0$.

It remains to show that $f_{B+Z/a}'(1) \leq 0$ for every $a > 0$. Since $Z \sim \mathcal{N}(0, 1)$, by the central
limit theorem for densities (see [39, Th. VII.2.7]), the pdf of $Z$ can be approximated to an
arbitrary precision by the pdf of a sum of i.i.d. Exp(1)-distributed random variables. Moreover, $B$
is the convolution of finitely many exponential distributions and $E[B + Z/a] = 1$. Hence, to
prove $f_{B+Z/a}'(1) \leq 0$, it suffices to show that the derivative of the convolution of finitely many
exponential pdfs computed at the mean value of the resulting distribution is nonpositive. This
follows from Lemma 15 (see Appendix II).
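The monotonicity of $g(a) = E[Q(aB - a + b)]$ claimed in this proof can be illustrated numerically in the simplest setting. The sketch below takes $B = S$ with $S \sim$ Exp(1) (the case $m = 1$, $u = 1$) and evaluates $g$ by the trapezoidal rule; the integration range, step count, and the value of $b$ are assumptions chosen for illustration, not parameters from the paper.

```python
import math

def Q(x):
    """Gaussian tail function Q(x) = P[Z > x] for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def g(a, b, steps=20000, upper=40.0):
    """E[Q(a * (S - 1) + b)] for S ~ Exp(1), trapezoidal rule on [0, upper]."""
    h = upper / steps
    total = 0.0
    for k in range(steps + 1):
        s = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * Q(a * (s - 1.0) + b) * math.exp(-s)
    return total * h

# Hypothetical b > 0 and an increasing grid of a values.
b = 1.0
a_grid = [0.0, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
g_values = [g(a, b) for a in a_grid]
print(g_values)
```

Under these assumptions the values start at $Q(b)$ for $a = 0$ and increase with $a$, so the infimum over $a \geq 0$ is attained at $a = 0$, which is the statement of (266) for this instance.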


Appendix V 
Proof of Theorem 11 


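Let $\eta > 0$ be an arbitrary constant and let the function $q_\eta(\cdot)$ be defined as follows:

$$q_\eta(x) \triangleq Q\left(\frac{x - \eta}{\sqrt{2x}}\right), \quad x > 0. \tag{279}$$

It follows from (239), (240), and (243) that every $(E, M, \epsilon)$-code for the MIMO Rayleigh block-
fading channel (12) for the case of perfect CSIR satisfies

$$\log M \leq \eta \log e - \log\left(\inf_{u \in \mathbb{C}^{\infty} :\, \|u\|_2^2 = E} E\left[q_\eta\left(\sum_{r=1}^{m_r} \sum_{i=1}^{\infty} |u_i H_{r,i}|^2\right)\right] - \epsilon\right). \tag{280}$$

Figure 2. A geometric illustration of $q_\eta(\cdot)$ in (279) (black curve), of the tangent line of $q_\eta(\cdot)$ (blue curve), and of $\hat{q}_\eta(\cdot)$ in (86)
(red curve).

Suppose that the function $\hat{q}_\eta(x)$ defined in (86) is convex on $[0, \infty)$, and that

$$q_\eta(x) \geq \hat{q}_\eta(x), \quad \text{for all } x \in [0, \infty). \tag{281}$$

In other words, suppose that $\hat{q}_\eta(x)$ is a convex lower bound on $q_\eta(x)$. Then, (87) follows because,
for every $u \in \mathbb{C}^{\infty}$ with $\|u\|_2^2 = E$,

$$E\left[q_\eta\left(\sum_{r=1}^{m_r} \sum_{i=1}^{\infty} |u_i H_{r,i}|^2\right)\right] \geq E\left[\hat{q}_\eta\left(\sum_{r=1}^{m_r} \sum_{i=1}^{\infty} |u_i H_{r,i}|^2\right)\right] \tag{282}$$

$$\geq \hat{q}_\eta\left(E\left[\sum_{r=1}^{m_r} \sum_{i=1}^{\infty} |u_i H_{r,i}|^2\right]\right) \tag{283}$$

$$= \hat{q}_\eta(m_r E). \tag{284}$$

Here, (283) follows from Jensen’s inequality.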

It remains to prove that $\hat{q}_\eta(x)$ is indeed a convex lower bound on $q_\eta(x)$. Observe the following
properties of $q_\eta(\cdot)$, which can be verified through standard algebraic manipulations:

• $q_\eta(\cdot)$ is monotonically decreasing;

• $\lim_{x \to 0} q_\eta(x) = 1$, $\lim_{x \to \infty} q_\eta(x) = 0$;

• $q_\eta(\eta) = 1/2$;

• if $\eta > \pi$, then $q_\eta'(\eta) = -1/(2\sqrt{\pi\eta}) < -1/(2\eta)$;

• if $\eta > 6$, there exists an $0 < x_0 < \eta$ such that $q_\eta(\cdot)$ is concave on $(0, x_0)$ and convex on
$(x_0, \infty)$.

Assume that $\eta > 6$. Then, the above properties of $q_\eta(\cdot)$ imply that there exists a unique $x_1$ such
that the line connecting $(0, 1)$ and $(x_1, q_\eta(x_1))$ lies below the graph of $q_\eta(x)$ and is tangent to
$q_\eta(\cdot)$ at $(x_1, q_\eta(x_1))$ (see Fig. 2). Since the slope of the line connecting $(0, 1)$ and $(x_1, q_\eta(x_1))$ is

$$\frac{1}{x_1} \left(Q\left(\frac{x_1 - \eta}{\sqrt{2 x_1}}\right) - 1\right) \tag{285}$$

and since the derivative of $q_\eta(x)$ at $x = x_1$ is given by

$$-\frac{x_1 + \eta}{4 x_1 \sqrt{\pi x_1}} \exp\left(-\frac{(x_1 - \eta)^2}{4 x_1}\right) \tag{286}$$

it follows that $x_1$ is the solution of (85). Furthermore, since $q_\eta'(\eta) < -1/(2\eta)$, and since $-1/(2\eta)$
is the slope of the line connecting $(0, 1)$ and $(\eta, 1/2)$, we have that $x_1 > \eta$. Observe now that
$\hat{q}_\eta(x)$ coincides with the line connecting $(0, 1)$ and $(x_1, q_\eta(x_1))$ for $x < x_1$, and that it coincides
with $q_\eta(x)$ if $x \geq x_1$. This proves that $\hat{q}_\eta(x)$ is indeed a convex lower bound on $q_\eta(x)$.
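The tangent-line construction described above can be reproduced numerically. The sketch below assumes $q_\eta(x) = Q((x - \eta)/\sqrt{2x})$, finds the tangency point $x_1$ by bisection on the condition that the chord from $(0, 1)$ has the same slope as the curve, and then builds the piecewise function (line up to $x_1$, $q_\eta$ afterwards). The helper names, the bisection bracket $[\eta, 100\eta]$, and the value $\eta = 10$ are hypothetical choices for illustration.

```python
import math

def Q(x):
    """Gaussian tail function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def q(x, eta):
    """Assumed form of q_eta(x) = Q((x - eta) / sqrt(2x)), x > 0."""
    return Q((x - eta) / math.sqrt(2.0 * x))

def dq(x, eta):
    """Derivative of q_eta with respect to x."""
    z = (x - eta) / math.sqrt(2.0 * x)
    density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return -density * (x + eta) / (2.0 * x * math.sqrt(2.0 * x))

def tangent_point(eta, iters=200):
    """Bisection for x1 > eta with (q(x1) - 1) / x1 = q'(x1)."""
    lo, hi = eta, 100.0 * eta
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if q(mid, eta) - 1.0 - mid * dq(mid, eta) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def q_hat(x, eta, x1):
    """Line from (0, 1) to (x1, q(x1)) for x < x1, and q itself for x >= x1."""
    if x < x1:
        return 1.0 + x * (q(x1, eta) - 1.0) / x1
    return q(x, eta)

eta = 10.0
x1 = tangent_point(eta)
xs = [0.05 * k for k in range(1, 1001)]
is_lower_bound = all(q_hat(x, eta, x1) <= q(x, eta) + 1e-12 for x in xs)
vals = [q_hat(x, eta, x1) for x in xs]
is_convex = all(vals[k - 1] + vals[k + 1] >= 2.0 * vals[k] - 1e-12
                for k in range(1, len(vals) - 1))
print(x1, is_lower_bound, is_convex)
```

Replacing $q_\eta$ by this convex minorant is what allows Jensen's inequality to be applied in (283), at the cost of a slightly looser bound for arguments below $x_1$.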